C++ Code Optimization Summary

Source: Internet
Author: User

I. Before Optimization
Before optimizing, we should first find the bottleneck in our code. When you do this, do not draw conclusions from a debug build, because it contains a great deal of additional code: a debug executable can be roughly 40% larger than the release version. The extra code supports debugging features such as symbol lookup, and most implementations provide different operator new and library functions for the debug and release builds. In addition, a release build may already have been optimized in many ways, including elimination of unnecessary temporary objects, loop unrolling, keeping objects in registers, and inlining.
We should also keep debugging and optimization separate, because they are different tasks. The debug build is for hunting down bugs and checking the program's logic; the release build is for performance tuning and optimization.
Let's take a look at the code optimization technologies below:

II. Placement of Declarations
Where variables and objects are declared in a program has a significant impact on performance; so does the choice between the prefix and postfix operators. This part focuses on four issues: initialization vs. assignment, placing declarations where they are really used, constructor initialization lists, and prefix vs. postfix operators.
(1) Use initialization instead of assignment
In C, variables could only be declared at the beginning of a block, but in C++ they can be declared almost anywhere. This lets us delay an object's declaration until the point of use, which has two benefits: 1. The object cannot be accidentally modified by other parts of the program before it is used; if it is declared at the top of the block but first used 20 lines later, no such guarantee exists. 2. It gives us the chance to improve performance by replacing assignment with initialization. Previously the declaration had to sit at the top of the block, where we often did not yet have the value we wanted, so the benefit of initialization was out of reach. Now we can initialize the object directly once the desired value is available, saving a step. Note: for built-in types there may be no difference between initialization and assignment, but for user-defined types the two differ noticeably, because assignment makes one extra function call, to operator=. So when choosing between assignment and initialization, initialization should be our first choice.
(2) Place declarations in the proper position
In some cases, moving a declaration to the proper position can yield a performance improvement worth paying attention to. For example:
bool is_c_needed();

void use()
{
    C c1;
    if (is_c_needed() == false)
    {
        return; // c1 was not needed
    }
    // use c1 here
    return;
}
In the code above, object c1 is created even when it may never be used, so we pay an unnecessary cost for it. You might ask how much time a single object c1 can really waste, but consider this case: C c1[1000];. That is not a waste to dismiss. We can avoid it simply by moving c1's declaration:
void use()
{
    if (is_c_needed() == false)
    {
        return; // c1 was not needed
    }
    C c1; // moved from the block's beginning
    // use c1 here
    return;
}
Hasn't the program's performance improved? So analyze your code carefully and place each declaration in its proper position; the benefit can be greater than you imagine.
(3) Initialization lists
We all know that the initialization list is generally used to initialize const or reference data members. But because of how it works, we can also use it to improve performance. First, look at this program:
class Person
{
private:
    C c_1;
    C c_2;
public:
    Person(const C& c1, const C& c2) : c_1(c1), c_2(c2) {}
};
Of course, the constructor can also be written as follows:
Person::Person(const C& c1, const C& c2)
{
    c_1 = c1;
    c_2 = c2;
}
So what performance difference do the two produce? To answer that, we must understand how each executes. With the initialization list, the data members are constructed directly from the arguments before the constructor body runs, so each member costs exactly one copy-constructor call. With assignment in the body, the data members are first default-constructed before the body runs, and then assigned through operator= inside it, so each member costs one extra function call compared with the initialization list. That is where the performance difference comes from. Note, however, that if your data members are of built-in types, use whichever form reads better, because the compiler generates the same code for both.
(4) Postfix vs. prefix operators
The prefix operators ++ and -- are more efficient than their postfix versions, because the postfix operator needs a temporary object to save the previous value before changing it. For built-in types the compiler removes this copy, but for user-defined types that is usually not possible. So use the prefix operator whenever you can.

III. Inline Functions
Inline functions not only remove the overhead of function calls but also retain the advantages of ordinary functions. However, inline functions are not a panacea; in some cases they can even reduce performance, so use them with care.
1. First, the benefits of inline functions: from the user's point of view, an inline function looks like an ordinary function. It can have parameters, a return value, and a scope, yet it does not introduce the cost of a normal function call. In addition, it is safer and easier to debug than a macro.
Of course, we should be aware that the inline specifier is only a suggestion to the compiler, and the compiler is free to ignore it. How does the compiler decide whether to inline a function? Key factors generally include the size of the function body, whether it declares local objects, and the function's overall complexity.
2. What happens if a function declared inline is not actually inlined? In theory, when the compiler refuses to inline a function, it is treated like an ordinary function, but some other problems arise. Consider the following code:
// filename: time.h
#include <ctime>
#include <iostream>
using namespace std;

class Time
{
public:
    inline void show() { for (int i = 0; i < 10; i++) cout << time(0) << endl; }
};
Because the member function Time::show() contains a local variable and a for loop, the compiler will generally refuse to inline it and treat it as an ordinary member function. However, the header file containing the class declaration is #included separately into each independent compilation unit:
// filename: f1.cpp
#include "time.h"

void f1()
{
    Time t1;
    t1.show();
}

// filename: f2.cpp
#include "time.h"

void f2()
{
    Time t2;
    t2.show();
}
As a result, the compiler generates two copies of the same member function for this program:
void f1();
void f2();

int main()
{
    f1();
    f2();
    return 0;
}
When the program is linked, the linker sees two copies of the same Time::show(), so it reports a multiple-definition link error. Some old C++ implementations coped with this by treating an un-inlined function as static: each copy was then visible only within its own compilation unit, which removed the link error but left multiple copies of the function in the program. In that case performance does not improve; instead, compile and link times and the size of the final executable all grow.
Fortunately, the new C++ standard changed the handling of un-inlined inline functions: a C++ implementation must generate only a single copy of the function. However, it may take a long time before all compilers support this.
There are two more headaches with inline functions. The first is maintenance. A function may start out small and inline-friendly, but as the system grows the body may need extra functionality until inlining becomes unlikely; at that point the inline specifier should be removed and the body moved into a separate source file. The other problem arises when inline functions are used in a code library: when an inline function changes, users must recompile their code to pick up the change, whereas with a non-inline function they only need to relink.
The point here is that inline functions are not a cure-all for performance. They produce the desired effect only when the function is very short; a function that is not short but is called in many places will inflate the executable. The most annoying case is when the compiler rejects the inlining. Old implementations handled this unsatisfactorily, and although newer ones do much better, they are still not perfect. Some compilers are smart enough to judge which functions can be inlined and which cannot, but most are less intelligent, so we need our own experience to judge. If an inline function cannot improve performance, avoid using it!


IV. Optimize Your Memory Usage
Optimization usually targets several goals: faster execution, effective use of system resources, and smaller memory footprint. In general, code optimization tries to improve on all of them. The declaration-placement techniques shown above eliminate redundant object creation and destruction, which both shrinks the program and speeds it up. Other optimization techniques, however, favor one goal at the expense of another: compressing memory usage often slows the code down, while fast code may need more memory. Two memory-usage optimizations are summarized below:
1. Bit Fields
The smallest unit of storage a C/C++ program can address is the bit. But because the bit is not a basic access unit in C/C++, bit fields trade running speed for reduced memory and secondary-storage usage. Note: some hardware architectures provide special processor instructions for bit access, so whether bit fields hurt program speed depends on the platform.
In real life, much of our data wastes space, because the application's value range does not need the full width of the type. You might say a bit is so small, how much storage can it really save? True, the effect is invisible when the data volume is small, but when the volume is astonishing the savings become eye-opening. You might also say that memory and disks keep getting cheaper, so why bother saving a few dollars? But there is another reason that should convince you: digital information transmission. A distributed database keeps multiple copies in different locations, and transmitting millions of records becomes very expensive. OK, let's see how it's done. First look at the following code:
struct BillingRec
{
    long cust_id;
    long timestamp;
    enum CallType
    {
        toll_free,
        local,
        regional,
        long_distance,
        international,
        cellular
    } type;
    enum CallTariff
    {
        off_peak,
        medium_rate,
        peak_time
    } tariff;
};
This struct occupies 16 bytes on a 32-bit machine, and much of that is wasted, especially in the two enum members. So consider the following improvement:
struct BillingRec
{
    int cust_id : 24; // 23 bits + 1 sign bit
    int timestamp : 24;
    enum CallType
    { //...
    };
    enum CallTariff
    { //...
    };
    unsigned call : 3;
    unsigned tariff : 2;
};
The record size now drops from 16 bytes to 8 bytes, cut in half. Not bad :)
2. Unions
A union reduces memory waste by placing two or more data members at the same memory address, with the requirement that only one of them can be valid at any given time. A union can have member functions, including a constructor and destructor, but it cannot have virtual functions. C++ also supports anonymous unions, which are unnamed objects. For example:
union { long n; void* p; }; // anonymous
n = 1000L; // members are accessed directly
p = 0;     // n is now also 0
Unlike a named union, an anonymous union cannot have member functions or non-public data members.
So when is a union useful? The following class retrieves a person's record from a database. The key can be either a unique ID or a name, but the two are never both valid:
class PersonalDetails
{
private:
    char* name;
    long id;
    //...
public:
    PersonalDetails(const char* nm); // a char* key is used
    PersonalDetails(long id) : id(id) {} // a numeric key is used
};
This code wastes memory, because only one key is valid at a time. An anonymous union can reduce the memory usage here, for example:
class PersonalDetails
{
private:
    union // anonymous
    {
        char* name;
        long id;
    };
public:
    PersonalDetails(const char* nm);
    PersonalDetails(long id) : id(id) { /*...*/ } // direct access to a member
    //...
};
By using a union, the size of the PersonalDetails class is halved. Note, though, that saving 4 bytes of memory is rarely worth the trouble of introducing a union, unless the class serves as the record type of a database with millions of entries or is transmitted over a very slow communication line. Also note that unions introduce no runtime overhead at all, so there is no speed penalty, and an anonymous union has the advantage that its members are accessed directly.
V. Speed Optimization
In some applications that demand extreme speed, every CPU cycle counts. This section presents some simple techniques for speed optimization.
1. Wrap long parameter lists in a class
The overhead of a function call grows with the length of its parameter list: at runtime the arguments must be pushed onto the stack, and with many parameters this takes noticeably longer. Wrapping the parameter list in a single class and passing it by reference can save much of that time. Of course, if the function itself runs long, the argument-passing cost is negligible and the change is unnecessary. But for short functions that are called frequently, packing a long parameter list into an object and passing it by reference will improve performance.
2. Register variables
The register specifier tells the compiler that an object will be used heavily and can be placed in a register. For example:
void f()
{
    int* p = new int[3000000];
    register int* p2 = p; // store the address in a register
    for (register int j = 0; j < 3000000; j++)
    {
        *p2++ = 0;
    }
    //... use p
    delete[] p;
}
Loop counters are the best candidates for register variables. When they are not kept in a register, much of each iteration is spent fetching the variable from memory and writing back its new value; keeping it in a register greatly reduces this cost. Note that the register specifier is only a suggestion to the compiler, and just as with inline, the compiler may refuse to keep an object in a register. Moreover, modern compilers already optimize loop counters into registers on their own (the register keyword was later deprecated in C++11 and removed in C++17). The register specifier is not limited to built-in types; it can be applied to objects of any type. If an object is too large to fit in a register, the compiler may still place it in fast memory, such as the cache.
Declaring a function parameter with the register specifier suggests that the compiler pass the actual argument in a register rather than on the stack. For example:

void f(register int j, register Date d);

3. Declare objects that never change as const
By declaring an object const, you allow the compiler to take advantage of that declaration and place the object in a register.
4. The runtime cost of virtual functions
When a virtual function call can be resolved statically by the compiler, no extra overhead is introduced; in addition, a very short virtual function can then be inlined. In the following example, a smart compiler can call the virtual functions statically:
#include <iostream>
using namespace std;

class V
{
public:
    virtual void show() const { cout << "I'm V" << endl; }
};

class W : public V
{
public:
    void show() const { cout << "I'm W" << endl; }
};

void f(V& v, V* pv)
{
    v.show();
    pv->show();
}

void g()
{
    V v;
    f(v, &v);
}

int main()
{
    g();
    return 0;
}
If the whole program appears in a single compilation unit, the compiler can inline g() into main(), and the call to f() inside g() can also be inlined. Because the dynamic types of the arguments passed to f() are then known at compile time, the compiler can bind the virtual calls statically. This is not guaranteed for every compiler, but some compilers do determine the dynamic types of the arguments at compile time, resolve the calls then, and avoid the cost of dynamic binding.
5. Function objects vs. function pointers
The benefits of replacing function pointers with function objects go beyond generality and maintainability: the compiler can also inline a function object's call operator, further improving performance.
VI. Final Help
The optimization techniques presented so far compromise neither design nor code readability; in fact, some of them improve the software's robustness and maintainability. In software with strict time and memory limits, however, these techniques may not suffice, and techniques that hurt portability or extensibility may also be needed. They should be used only after all other optimizations have been applied and the requirements are still not met.
1. Disable RTTI and exception handling support
When you port pure C code to a C++ compiler, you may notice some performance loss. This is not a flaw of the language or the compiler but an adjustment the compiler makes: to support RTTI and exception handling, the C++ compiler inserts additional code, which enlarges the executable and reduces efficiency. When compiling pure C code the extra code is unnecessary, so to reach the same performance as a C compiler, disable the compiler's support for RTTI and exception handling.
2. Inline assembly
Time-critical sections can be rewritten in native assembly, which may yield a significant speedup. However, this approach should not be taken lightly, because it makes future changes much harder: programmers maintaining the code may not know assembly, and porting the software to another platform requires rewriting the assembly sections. It also takes longer to develop and test assembly code.
3. Interact with the operating system directly
API functions let you interact with the operating system directly. Sometimes, executing a system command directly can be much faster; for this purpose you can use the standard function system(). For example, on a DOS/Windows system you can list the files in the current directory as follows:
#include <cstdlib>
using namespace std;

int main()
{
    system("dir"); // execute the "dir" command
}
Note: this is a trade-off among speed, portability, and extensibility.
