Item16. Remember the 80-20 rule:
The 80-20 rule states that about 20% of a program's code accounts for 80% of its resource usage: roughly 20% of the code consumes 80% of the running time, 20% of the code uses 80% of the memory, 20% of the code performs 80% of the disk accesses, and about 20% of the code is responsible for 80% of the maintenance effort. The rule has been confirmed again and again on many different machines, operating systems, and applications. The 80-20 rule is more than a catchy phrase; it is a guideline about system performance with broad applicability and a solid experimental foundation.
The overall performance of a piece of software is almost always determined by a small portion of its code.
The right way to identify the troublesome 20% of a program is to use a profiler. Not just any profiler will do: you want one that directly measures the resources you are interested in. Remember, too, that a profiler can only tell you how a program behaved on a particular run (or runs). If you profile a program while it processes unrepresentative input data, the resulting profile is itself unrepresentative. That can lead you to optimize behavior that is rarely exercised, while the software's overall efficiency on common inputs actually gets worse.
The best way to guard against such misleading results is to profile your software with as many representative data sets as possible.
Item17. Consider using lazy evaluation:
The key is to be lazy.
Lazy evaluation is useful in the following four scenarios:
Reference counting
Don't make a copy of something the moment it might be needed; instead, be lazy and share another object's value for as long as you can get away with it.
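For instance, here is a minimal, self-contained sketch of lazy copying (illustrative only; the Document class and its members are invented for this example, and std::shared_ptr supplies the reference count):

#include <iostream>
#include <memory>
#include <string>

// Copying a Document merely shares the underlying text; the expensive
// duplication happens only if someone actually edits a shared copy.
class Document {
public:
    explicit Document(std::string text)
        : body(std::make_shared<std::string>(std::move(text))) {}

    const std::string& contents() const { return *body; }

    void append(const std::string& more) {
        if (body.use_count() > 1)                          // still shared?
            body = std::make_shared<std::string>(*body);   // copy now, lazily
        *body += more;
    }

private:
    std::shared_ptr<std::string> body;   // shared buffer plus reference count
};

int main() {
    Document a("draft");
    Document b = a;            // no string is copied here
    b.append(" v2");           // the copy happens only at this point
    std::cout << a.contents() << " / " << b.contents() << '\n';
    return 0;
}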
Distinguishing reads from writes with operator[]
The proxy classes of Item 30 can be used to achieve this, as sketched below.
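A compilable sketch of that idea (all names here are invented; this is not the book's String class): operator[] hands back a proxy instead of a char, and the proxy takes the cheap path for reads and the copy-on-write path for writes:

#include <cstddef>
#include <iostream>
#include <memory>
#include <string>

// LazyString shares its buffer among copies; operator[] returns a proxy
// rather than a char&, so a plain read never forces the buffer to be copied.
class LazyString {
public:
    class CharProxy {
    public:
        CharProxy(LazyString& s, std::size_t i) : str(s), index(i) {}

        operator char() const {                  // used as an rvalue: a read
            return (*str.data)[index];
        }
        CharProxy& operator=(char c) {           // used as an lvalue: a write
            if (str.data.use_count() > 1)        // copy-on-write
                str.data = std::make_shared<std::string>(*str.data);
            (*str.data)[index] = c;
            return *this;
        }
    private:
        LazyString& str;
        std::size_t index;
    };

    explicit LazyString(const char* s) : data(std::make_shared<std::string>(s)) {}

    CharProxy operator[](std::size_t i) { return CharProxy(*this, i); }

private:
    std::shared_ptr<std::string> data;
};

int main() {
    LazyString s1("hello");
    LazyString s2(s1);          // s1 and s2 share one buffer
    char c = s2[0];             // read: the buffer stays shared
    s2[0] = 'H';                // write: s2 gets its own copy first
    std::cout << c << '\n';
    return 0;
}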
Lazy fetching
Suppose your program uses large objects containing many fields. Such objects persist across program runs, so they are stored in a database, and each object has a unique object identifier that can be used to retrieve it from the database. Because a LargeObject instance is big, fetching all the data for such an object can be a costly database operation, especially when the data must be retrieved from a remote database and pushed across a network. In many cases there is no need to read all of the data.
When a LargeObject is created, only an object "shell" is built; no data is read from the disk. When a particular field is needed, its value is fetched from the database. One way to implement this kind of "demand-paged" object initialization is:
class LargeObject {
public:
  LargeObject(ObjectID id);

  const string& field1() const;
  int field2() const;
  double field3() const;
  const string& field4() const;
  ...

private:
  ObjectID oid;

  auto_ptr<string> pField1Value;
  auto_ptr<int>    pField2Value;
  auto_ptr<double> pField3Value;
  auto_ptr<string> pField4Value;
  ...
};

LargeObject::LargeObject(ObjectID id)
: oid(id) {}

const string& LargeObject::field1() const
{
  if (pField1Value.get() == 0) {
    read the data for field 1 from the database and make
    pField1Value point to it;
  }

  return *pField1Value;
}
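For reference, here is a compilable approximation of the same idea (the database is faked with a stand-in function, the names are assumptions of this sketch, and std::unique_ptr plays the role of the auto_ptr above; the member is mutable so the logically-const accessor can fill in the cached value):

#include <iostream>
#include <memory>
#include <string>

typedef int ObjectID;

// Stand-in for a real database query (an assumption of this sketch).
std::string fetchField1FromDatabase(ObjectID id) {
    std::cout << "(expensive database access for object " << id << ")\n";
    return "field1 data";
}

class LargeObject {
public:
    explicit LargeObject(ObjectID id) : oid(id) {}    // just a shell, no data read

    const std::string& field1() const {
        if (!pField1Value) {                          // fetch on first access only
            pField1Value.reset(new std::string(fetchField1FromDatabase(oid)));
        }
        return *pField1Value;
    }

private:
    ObjectID oid;
    mutable std::unique_ptr<std::string> pField1Value;   // cache for field 1
};

int main() {
    LargeObject obj(42);                  // cheap to create
    std::cout << obj.field1() << '\n';    // database access happens here
    std::cout << obj.field1() << '\n';    // cached; no second access
    return 0;
}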
Lazy expression evaluation
template<class T>
class Matrix { ... };              // for homogeneous matrices

Matrix<int> m1(1000, 1000);        // a 1000 by 1000 matrix
Matrix<int> m2(1000, 1000);        // ditto

...

Matrix<int> m3 = m1 + m2;          // add m1 and m2
All that needs to be done is to set up a data structure recording that m3's value is the sum of m1 and m2, perhaps consisting of pointers to the two operands plus an enum indicating that the operation is addition. Clearly, setting up this data structure is much faster than actually adding m1 and m2, and it uses much less memory, too. Suppose that before m3 is ever used, this code is executed:
Matrix<int> m4(1000, 1000);

...                                // give m4 some values

m3 = m4 * m1;
Now we can forget that m3 was ever the sum of m1 and m2 (thereby saving the cost of that computation); instead we remember that m3 is the product of m4 and m1. Needless to say, we don't perform the multiplication right away either.
A more common application arises when only part of a result is needed. For example, suppose we initialize m3 to the sum of m1 and m2 and then use m3 like this:
cout << m3[4];                     // print the fourth row of m3
Clearly we can't be completely lazy any longer; we have to compute the values in the fourth row of m3. But there is no reason to be ambitious about it either: there is no reason to compute anything beyond the fourth row of m3. The rest of m3 can stay uncomputed until it is actually needed.
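A bare-bones sketch of such deferred evaluation (illustrative only; the book's design is far more complete, and here the deferred sum is its own type rather than a Matrix): operator+ merely records its operands, and an element is computed only when it is actually read:

#include <cstddef>
#include <iostream>
#include <vector>

// Invented Matrix and MatrixSum classes, just to show the shape of the idea.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : r(rows), c(cols), data(rows * cols, 0) {}

    int  operator()(std::size_t i, std::size_t j) const { return data[i * c + j]; }
    int& operator()(std::size_t i, std::size_t j)       { return data[i * c + j]; }

private:
    std::size_t r, c;
    std::vector<int> data;
};

// Proxy returned by operator+: it records the operands but adds nothing yet.
class MatrixSum {
public:
    MatrixSum(const Matrix& a, const Matrix& b) : lhs(a), rhs(b) {}

    // An element is computed only when someone actually reads it.
    int operator()(std::size_t i, std::size_t j) const {
        return lhs(i, j) + rhs(i, j);
    }

private:
    const Matrix& lhs;
    const Matrix& rhs;
};

MatrixSum operator+(const Matrix& a, const Matrix& b) {
    return MatrixSum(a, b);                 // constant-time "addition"
}

int main() {
    Matrix m1(1000, 1000), m2(1000, 1000);
    m1(4, 7) = 3;  m2(4, 7) = 39;

    MatrixSum m3 = m1 + m2;                 // no elements added yet
    std::cout << m3(4, 7) << '\n';          // only this element is computed
    return 0;
}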
Item18. Amortize the cost of expected computations:
Lazy evaluation defers work until it is unavoidable; over-eager evaluation goes the other way: when a particular computation is requested frequently, you design a data structure that can handle those requests especially efficiently (for example, by caching or precomputing results), thereby lowering the cost of each individual request.
The following function uses a map from the Standard Template Library (STL) as a local cache, so that the database does not have to be queried on every call:
int findCubicleNumber(const string& employeeName)
{
  // define a static map to hold (employee name, cubicle number)
  // pairs. This map is the local cache.
  typedef map<string, int> CubicleMap;
  static CubicleMap cubes;

  // try to find an entry for employeeName in the cache;
  // the STL iterator "it" will then point to the found
  // entry, if there is one (see Item 35 for details)
  CubicleMap::iterator it = cubes.find(employeeName);

  // "it"'s value will be cubes.end() if no entry was
  // found (this is standard STL behavior). If this is
  // the case, consult the database for the cubicle
  // number, then add it to the cache
  if (it == cubes.end()) {
    int cubicle =
      the result of looking up employeeName's cubicle
      number in the database;

    cubes[employeeName] = cubicle;         // add the pair
                                           // (employeeName, cubicle)
                                           // to the cache
    return cubicle;
  }
  else {
    // "it" points to the correct cache entry, which is a
    // (employee name, cubicle number) pair. We want only
    // the second component of this pair, and the member
    // "second" will give it to us
    return (*it).second;
  }
}
(One code detail deserves explanation: the final statement returns (*it).second instead of the more common it->second. Why? The answer is to stay within the STL's guarantees. In brief, an iterator is an object, not a pointer, so there is no guarantee that -> can be applied to it correctly. The STL does require that . and * work on iterators, however, so (*it).second, though syntactically clumsy, is guaranteed to work.)
(thy: this is trading space for time. For example, in the TSP problem you can compute the matrix of distances between all pairs of cities in advance.)
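A sketch of that kind of precomputation (the coordinates, names, and data here are made up): all pairwise distances are paid for once, up front, so each later lookup during the search is a constant-time table access:

#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

struct City { double x, y; };

// Over-eager evaluation: compute every pairwise distance once, in advance,
// so the inner loops of a TSP search only do table lookups.
std::vector<std::vector<double> > buildDistanceMatrix(const std::vector<City>& cities) {
    const std::size_t n = cities.size();
    std::vector<std::vector<double> > dist(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dist[i][j] = std::hypot(cities[i].x - cities[j].x,
                                    cities[i].y - cities[j].y);
    return dist;
}

int main() {
    std::vector<City> cities = { {0, 0}, {3, 4}, {6, 8} };   // made-up coordinates
    std::vector<std::vector<double> > d = buildDistanceMatrix(cities);
    std::cout << d[0][1] << '\n';   // 5, looked up rather than recomputed
    return 0;
}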
Item19. Understand the origin of temporary objects:
True temporary objects in C++ are invisible: they do not appear in your source code. A temporary arises whenever a non-heap object is created but not named. Such unnamed objects usually come about in one of two ways: when implicit type conversions are applied to make function calls succeed, and when functions return objects.
These conversions occur only when an object is passed by value or when it is bound to a reference-to-const parameter. They do not occur when the object is passed to a reference-to-non-const parameter.
Anytime you see a reference-to-const parameter, the possibility exists that a temporary will be created and bound to that parameter. Anytime you see a function returning an object, a temporary will be created (and later destroyed).
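For example (a sketch in the spirit of this Item; countChar and MAX_STRING_LEN are assumed helpers, not part of this text): passing a char array where a const string& is expected compiles only because a temporary string is created:

#include <cstddef>
#include <iostream>
#include <string>
using std::string;

// Hypothetical helper used only for illustration.
std::size_t countChar(const string& str, char ch) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < str.size(); ++i)
        if (str[i] == ch) ++count;
    return count;
}

int main() {
    const int MAX_STRING_LEN = 64;
    char buffer[MAX_STRING_LEN] = "lazy evaluation";

    // buffer is a char array, not a string. The call succeeds because the
    // compiler creates a temporary string from buffer, binds it to the
    // reference-to-const parameter, and destroys it after the call returns.
    std::cout << countChar(buffer, 'a') << '\n';   // prints 3
    return 0;
}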
Item20. Facilitate the return value optimization:
Believe me: some functions (operator* among them) simply must return objects. That's the way they work. Don't fight it; you can't win.
You should focus your efforts on finding ways to reduce the cost of the objects that are returned, not on eliminating the objects themselves.
The trick is to return constructor arguments instead of named objects:
const Rational operator*(const Rational& lhs,
                         const Rational& rhs)
{
  return Rational(lhs.numerator() * rhs.numerator(),
                  lhs.denominator() * rhs.denominator());
}
This may look as though it creates a temporary Rational object via the expression
Rational(lhs.numerator() * rhs.numerator(),
         lhs.denominator() * rhs.denominator())
and as though the function's return value is a copy of that temporary. The C++ rules, however, allow compilers to optimize such temporary objects out of existence.
Rational c = a * b;                 // operator* is called here
Compilers are allowed to eliminate both the temporary inside operator* and the temporary returned by operator*: they can construct the object defined by the return expression directly in the memory allocated for the target object c. If your compiler does this, the total cost in temporaries of calling operator* is zero: no temporaries are created.
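For contrast, here is the version the trick avoids, returning a named local object (illustrative only, assuming the same Rational class as above; some compilers can also optimize this form via the named return value optimization, but the constructor-argument form gives them the easiest job):

const Rational operator*(const Rational& lhs,
                         const Rational& rhs)
{
    Rational result(lhs.numerator() * rhs.numerator(),
                    lhs.denominator() * rhs.denominator());
    return result;               // the named object may have to be
                                 // constructed, copied, and destroyed
}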
Item21. Overload to avoid implicit type conversions:
class UPInt {                              // class for unlimited
public:                                    // precision integers
  UPInt();
  UPInt(int value);
  ...
};

// for an explanation of why the return value is const,
// see Effective C++, Item 21
const UPInt operator+(const UPInt& lhs, const UPInt& rhs);

UPInt upi1, upi2;
...
UPInt upi3 = upi1 + upi2;
Consider the following statements:
upi3 = upi1 + 10;
upi3 = 10 + upi2;
These statements succeed, too, through the creation of temporary objects that convert the integer 10 into a UPInt. There is another way to make such mixed-type calls succeed, one that eliminates the need for implicit type conversions: if we want to be able to add UPInt and int objects, we can declare several functions, each with a different set of parameter types.
const UPInt operator+(const UPInt& lhs,      // add UPInt
                      const UPInt& rhs);     // and UPInt

const UPInt operator+(const UPInt& lhs,      // add UPInt
                      int rhs);              // and int

const UPInt operator+(int lhs,               // add int
                      const UPInt& rhs);     // and UPInt

UPInt upi1, upi2;
...
UPInt upi3 = upi1 + upi2;                    // fine, no temporary created
                                             // for upi1 or upi2

upi3 = upi1 + 10;                            // fine, no temporary created
                                             // for upi1 or 10

upi3 = 10 + upi2;                            // fine, no temporary created
                                             // for 10 or upi2
Once you start using overloading to eliminate type conversions, however, you risk getting carried away and declaring a function like this, which puts you in danger:
const UPInt operator+(int lhs, int rhs);     // error!
C++ has a rule that every overloaded operator must take at least one argument of a user-defined type.
Keep the 80-20 rule in mind, however (see Item 16): there is no point in implementing a slew of overloaded functions unless you have good reason to believe the overloads will significantly improve the overall efficiency of the programs that use them.
Item22. Consider using op= instead of stand-alone op:
template<class T>
const T operator+(const T& lhs, const T& rhs)
{
  return T(lhs) += rhs;                    // see the discussion below
}

template<class T>
const T operator-(const T& lhs, const T& rhs)
{
  return T(lhs) -= rhs;                    // see the discussion below
}
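For a class to benefit from this approach, it only needs to supply the assignment version of the operator; the stand-alone version can then be a thin wrapper. A minimal sketch (the Rational members here are assumptions for illustration, and the wrapper is written as a plain function rather than the template above):

class Rational {
public:
    Rational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}

    // The assignment version does the real work and creates no temporary.
    Rational& operator+=(const Rational& rhs) {
        n = n * rhs.d + rhs.n * d;
        d = d * rhs.d;
        return *this;
    }

private:
    int n, d;
};

// The stand-alone version copies the left-hand operand and reuses
// operator+= on the copy.
const Rational operator+(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs) += rhs;
}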
First, in general, the assignment versions of operators are more efficient than the stand-alone versions, because a stand-alone version must return a new object, and that costs the construction and destruction of a temporary. An assignment version writes its result into its left-hand argument, so there is no need to create a temporary to hold the operator's return value.
Second, by offering the assignment version of an operator as well as its stand-alone version, you allow clients of your class to make the trade-off between convenience and efficiency themselves. That is, clients can decide whether to write this:
Rational a, b, c, d, result;

...

result = a + b + c + d;                    // probably uses three temporary
                                           // objects, one for each call
                                           // to operator+
or this:
result  = a;                               // no temporary needed
result += b;                               // no temporary needed
result += c;                               // no temporary needed
result += d;                               // no temporary needed
The former is easier to write, debug, and maintain, and its performance is acceptable about 80% of the time. The latter is more efficient and, one suspects, more intuitive for former assembly-language programmers.
Finally, a point about implementing the stand-alone versions: for historical reasons, unnamed objects have always been easier to eliminate than named ones, so when choosing between a named object and a temporary object, it is better to use the temporary. It should never cost you more than its named colleague, and, especially with older compilers, it may cost you less.
Item23. Consider alternative libraries:
#ifdef STDIO
#include <stdio.h>
#else
#include <iostream>
#include <iomanip>
using namespace std;
#endif

const int VALUES = 30000;                  // # of values to read/write

int main()
{
  double d;

  for (int n = 1; n <= VALUES; ++n) {
#ifdef STDIO
    scanf("%lf", &d);
    printf("%10.5f", d);
#else
    cin >> d;
    cout << setw(10)                       // set field width
         << setprecision(5)                // set decimal places
         << setiosflags(ios::showpoint)    // keep trailing 0s
         << setiosflags(ios::fixed)        // use these settings
         << d;
#endif

    if (n % 5 == 0) {
#ifdef STDIO
      printf("\n");
#else
      cout << '\n';
#endif
    }
  }

  return 0;
}
When the natural logarithms of the positive integers are fed to this program, it produces output like this:
0.00000 0.69315 1.09861 1.38629 1.60944
1.79176 1.94591 2.07944 2.19722 2.30259
2.39790 2.48491 2.56495 2.63906 2.70805
2.77259 2.83321 2.89037 2.94444 2.99573
3.04452 3.09104 3.13549 3.17805 3.21888
Iostreams can produce fixed-format output like this, but getting it is clearly nowhere near as convenient as simply calling printf("%10.5f", d).
On the other hand, operator<< is both type-safe and extensible, two advantages printf lacks.
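To make "extensible" concrete (a minimal sketch; this Rational class is an assumption for the example): a user-defined type can be given its own operator<<, after which it plugs into the same formatted-output machinery, something printf's fixed set of format specifiers cannot do:

#include <iostream>

// Assumed minimal user-defined type for the example.
class Rational {
public:
    Rational(int n = 0, int d = 1) : num(n), den(d) {}
    int numerator()   const { return num; }
    int denominator() const { return den; }
private:
    int num, den;
};

// Extending the iostream machinery to a new type: no new format
// specifier is needed, and the call is type-checked at compile time.
std::ostream& operator<<(std::ostream& os, const Rational& r) {
    return os << r.numerator() << '/' << r.denominator();
}

int main() {
    Rational oneHalf(1, 2);
    std::cout << oneHalf << '\n';        // prints 1/2
    // printf("%??", oneHalf);           // no %-specifier exists for Rational
    return 0;
}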
The point of this Item is that libraries offering similar functionality often make different performance trade-offs, so once you have identified your software's bottlenecks (by profiling; see Item 16), you should consider whether it is possible to remove those bottlenecks by replacing one library with another.
Item24. Understand the costs of virtual functions, multiple inheritance, virtual base classes, and RTTI:
When a virtual function is called, the code executed must correspond to the dynamic type of the object on which the call is made; the type of the pointer or reference to the object is immaterial. How can compilers provide this behavior efficiently? Most implementations use virtual tables and virtual table pointers, commonly known as vtbls and vptrs.
The first cost of virtual functions: you have to set aside space for a virtual table for each class that contains virtual functions.
The second cost of virtual functions: every object of a class containing virtual functions pays for an extra pointer stored inside the object.
To find the address of the virtual function to call, the compiler-generated code does the following (a rough, hand-written approximation appears after the two steps):
1. Follow the object's vptr to its class's vtbl, and find in that vtbl the pointer corresponding to the function being called.
2. Call the function pointed to by the pointer found in step 1.
This is nearly as efficient as calling a non-virtual function.
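Roughly speaking, the generated code behaves as if each class carried an explicit table of function pointers, along the following lines (a hand-written approximation for illustration, not what any particular compiler emits):

#include <iostream>

// A hand-rolled approximation of vptr/vtbl dispatch (illustrative only).
struct Shape;                                       // forward declaration

typedef void (*DrawFn)(const Shape*);               // entry type in the "vtbl"

struct VTable {
    DrawFn draw;                                    // one slot per virtual function
};

struct Shape {
    const VTable* vptr;                             // the hidden per-object pointer
};

void drawCircle(const Shape*) { std::cout << "circle\n"; }
void drawSquare(const Shape*) { std::cout << "square\n"; }

const VTable circleVTable = { drawCircle };         // one table per class
const VTable squareVTable = { drawSquare };

int main() {
    Shape c; c.vptr = &circleVTable;                // what a constructor would do
    Shape s; s.vptr = &squareVTable;

    Shape* p = &s;
    // Step 1: follow p->vptr to the class's table and pick the right slot.
    // Step 2: call through the pointer found there.
    p->vptr->draw(p);                               // prints "square"
    return 0;
}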
The real runtime cost of virtual functions comes from their interaction with inlining. For all practical purposes, virtual functions are not inlined. That is because "inline" means "during compilation, replace the call site with the body of the function being called," while "virtual" means "wait until runtime to see which function is called." If the compiler does not know, at a given call site, which function will be invoked, you can understand why it will not inline that call.
This is the third cost of virtual functions: you effectively give up inlining. (Virtual functions can be inlined when they are invoked through objects, but most virtual function calls are made through pointers or references to objects, and such calls cannot be inlined. Because such calls are the norm, virtual functions are effectively never inlined.)
In practice, current compilers generally ignore inline directives on virtual functions.
Multiple inheritance makes things more complicated still.
The following table summarizes the main costs of virtual functions, multiple inheritance, virtual base classes, and RTTI:
Feature               | Increases Size of Objects | Increases Per-class Data | Reduces Inlining
Virtual functions     | Yes                       | Yes                      | Yes
Multiple inheritance  | Yes                       | Yes                      | No
Virtual base classes  | Often                     | Sometimes                | No
RTTI                  | No                        | Yes                      | No
Remember that if you need the functionality these features provide and you do without them, you have to code it by hand, and in most cases your hand-coded approximations will be both less efficient and less robust than the compiler-generated code would have been.