Item M17: Consider using lazy evaluation
From the standpoint of efficiency, the best computation is the one you never perform at all. Fine, but if you never need to do something, why would you put code in your program to do it in the first place? And if you do need to do it, how can you possibly avoid executing the code?
The key is to be lazy.
Remember when you were a child and your parents told you to clean your room? If you were anything like me, you said "OK" and promptly went back to what you were doing. You did not clean your room. In fact, cleaning your room was the last thing on your mind, right up until you heard your parents coming down the hall to check whether the room was clean. Then you sprinted to your room and started cleaning as fast as you could. If you were lucky, your parents never checked, and you avoided cleaning the room altogether.
It turns out that the same delaying tactics that work for a five-year-old also work for a C++ programmer.
In computer science, we dignify such procrastination with the name lazy evaluation. When you use lazy evaluation, you write your classes so that they defer computations until the results of those computations are required. If the results are never required, the computations are never performed, and neither your software's clients nor your parents are any the wiser.
Perhaps you are wondering what exactly I am talking about. Perhaps an example would help. Lazy evaluation is applicable in an enormous variety of application areas, so I will describe four.
Reference Counting

    class String { ... };     // a string class (the standard string type
                              // may be implemented as described below,
                              // but it doesn't have to be)

    String s1 = "Hello";
    String s2 = s1;           // call String copy constructor

A common implementation of the String copy constructor initializes s2 from s1 so that s1 and s2 each hold their own copy of "Hello". Such a copy constructor can be expensive, because it has to make a copy of s1's value and give it to s2. That typically means allocating heap memory with the new operator (see Item 8) and calling strcpy to copy the data from s1 into s2. This is eager evaluation: a copy of s1's value is made and given to s2 just because the String copy constructor was called. At this point, however, s2 has no real need for its own copy of the value, because s2 has not been used yet.
The lazy approach is a lot less work. Instead of giving s2 its own copy of s1's value, we let s2 share s1's value. All we have to do is a little bookkeeping so we know who is sharing what, and in return we save the cost of the call to new and the cost of copying the characters. The fact that s1 and s2 share a data structure is transparent to clients, and it makes no difference in statements like the following, because they only read values:

    cout << s1;               // read s1's value
    cout << s1 + s2;          // read s1's and s2's values

Sharing a value makes a difference only when one or the other string is modified. Then it is essential that only one string be changed, not both. For example, in this statement:

    s2.convertToUpperCase();

it is crucial that only s2's value be changed, not s1's as well.
To handle a statement like this, String's convertToUpperCase function has to make a copy of s2's value and make that copy private to s2 before modifying it. Inside convertToUpperCase we can be lazy no longer: we must make a copy of s2's (shared) value for s2's exclusive use. On the other hand, if s2 is never modified, we never have to make a private copy of its value; it can go on sharing a value for as long as it exists. If we are lucky, s2 will never be modified, in which case we never spend the effort of giving it its own value.
The details of how to implement this kind of value sharing (including all the code) are given in Item M29, but the underlying idea is lazy evaluation: don't make a copy of anything until you really need to. Be lazy and share someone else's copy for as long as you can get away with it. In some application areas, you can often get away with it forever.
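To make the sharing idea concrete, here is a minimal sketch of a copy-on-write string. It is not the String class of Item M29: the name CowString, its helper functions, and the use of std::shared_ptr for the bookkeeping are all assumptions made for illustration, and the sketch ignores thread safety.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <string>

// Hypothetical illustration; not the String class from Item M29.
class CowString {
public:
    CowString(const char* text)
        : value_(std::make_shared<std::string>(text)) {}

    // The compiler-generated copy constructor copies only the shared_ptr,
    // so a copy shares the characters instead of duplicating them.

    const std::string& str() const { return *value_; }

    // True if both objects currently refer to the same character buffer.
    bool sharesValueWith(const CowString& other) const {
        return value_ == other.value_;
    }

    void convertToUpperCase() {
        detach();  // we can be lazy no longer
        for (char& c : *value_) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
    }

private:
    // Make a private copy, but only if the value is actually shared.
    void detach() {
        if (value_.use_count() > 1) {
            value_ = std::make_shared<std::string>(*value_);
        }
    }

    std::shared_ptr<std::string> value_;
};

void demoSharing() {
    CowString s1 = "Hello";
    CowString s2 = s1;                   // no characters copied
    assert(s1.sharesValueWith(s2));

    s2.convertToUpperCase();             // s2 gets its own copy here
    assert(!s1.sharesValueWith(s2));
    assert(s1.str() == "Hello");
    assert(s2.str() == "HELLO");
}
```

Calling demoSharing() exercises both cases: the copy that never pays for an allocation, and the modification that finally forces one.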
Distinguishing Reads from Writes
Pursuing the example of reference-counted strings a bit further, we come upon a second way in which lazy evaluation can help. Consider this code:

    String s = "Homer's Iliad";   // assume s is a reference-counted string
    ...
    cout << s[3];                 // call operator[] to read s[3]
    s[3] = 'x';                   // call operator[] to write s[3]

The first call to operator[] reads part of a string, but the second call performs a write. We would like to tell the read call from the write call, because reading a reference-counted string is cheap, whereas writing to such a string may require making a new copy of the string's value before the write.
This puts us in a difficult position. To do what we want, operator[] has to behave differently depending on whether it is called to perform a read or a write. How can we determine, inside operator[], whether it has been called in a read or a write context? The brutal truth is that we can't. By using lazy evaluation and the proxy classes described in Item M30, however, we can defer the decision on whether to read or write until we can determine the correct answer.
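The deferral can be sketched as follows. This is a simplified illustration, not the proxy classes of Item M30: ProxyString, CharProxy, and sharesValueWith are hypothetical names, and std::shared_ptr stands in for a hand-written reference count. The point is that operator[] neither reads nor writes; it returns a proxy, and what the caller then does with the proxy reveals the context.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical sketch; Item M30 develops the real proxy classes.
class ProxyString {
public:
    ProxyString(const char* text)
        : value_(std::make_shared<std::string>(text)) {}

    class CharProxy {
    public:
        CharProxy(ProxyString& owner, std::size_t index)
            : owner_(owner), index_(index) {}

        // Read context: the proxy is converted to a char. No copy needed.
        operator char() const { return (*owner_.value_)[index_]; }

        // Write context: the proxy is assigned to. Unshare first.
        CharProxy& operator=(char c) {
            if (owner_.value_.use_count() > 1) {
                owner_.value_ = std::make_shared<std::string>(*owner_.value_);
            }
            (*owner_.value_)[index_] = c;
            return *this;
        }

    private:
        ProxyString& owner_;
        std::size_t index_;
    };

    // operator[] does no reading or writing itself; it hands back a proxy.
    CharProxy operator[](std::size_t index) { return CharProxy(*this, index); }

    // True if both objects currently refer to the same character buffer.
    bool sharesValueWith(const ProxyString& other) const {
        return value_ == other.value_;
    }

private:
    std::shared_ptr<std::string> value_;
};

void demoProxy() {
    ProxyString s1 = "Homer's Iliad";
    ProxyString s2 = s1;                 // shares s1's characters

    char c = s1[3];                      // read: the value stays shared
    assert(c == 'e');
    assert(s1.sharesValueWith(s2));

    s2[3] = 'x';                         // write: s2 unshares first
    assert(!s1.sharesValueWith(s2));
    char c1 = s1[3];
    char c2 = s2[3];
    assert(c1 == 'e');
    assert(c2 == 'x');
}
```

The cost of the design is that CharProxy is not a real char, so code that takes the address of s[3] or binds it to a char& would need extra support.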
Lazy Fetching
As a third example of lazy evaluation, suppose your program uses large objects that contain many fields. These objects must persist across program runs, so they are stored in a database. Each object has a unique object identifier that can be used to retrieve the object from the database:

    class LargeObject {                        // large persistent objects
    public:
      LargeObject(ObjectID id);                // restore object from disk

      const string& field1() const;            // value of field 1
      int field2() const;                      // value of field 2
      double field3() const;                   // ...
      const string& field4() const;
      const string& field5() const;
      ...
    };

Now consider the cost of restoring a LargeObject from disk:

    void restoreAndProcessObject(ObjectID id)
    {
      LargeObject object(id);                  // restore object
      ...
    }

Because LargeObject instances are big, fetching all the data for such an object can be a costly database operation, especially if the data must be retrieved from a remote database and sent across a network. In some cases, reading all that data is unnecessary. For example, consider this kind of application:

    void restoreAndProcessObject(ObjectID id)
    {
      LargeObject object(id);

      if (object.field2() == 0) {
        cout << "Object " << id << ": null field2.\n";
      }
    }

Here only the value of field2 is required, so any effort spent setting up the other fields is wasted.
The lazy approach to this problem is to read no data from disk when a LargeObject object is created. Instead, only the "shell" of an object is created, and data is retrieved from the database only when that particular data is needed inside the object. (A note from C++ Primer, 5th edition: if we want to be able to modify a data member of a class even inside a const member function, we add the mutable keyword to the declaration of that member.) This kind of "demand-paged" object initialization can be implemented as follows:

    class LargeObject {
    public:
      LargeObject(ObjectID id);

      const string& field1() const;
      int field2() const;
      double field3() const;
      const string& field4() const;
      ...

    private:
      ObjectID oid;

      mutable string *field1Value;             // see below for a
      mutable int *field2Value;                // discussion of "mutable"
      mutable double *field3Value;
      mutable string *field4Value;
      ...
    };

    LargeObject::LargeObject(ObjectID id)
    : oid(id), field1Value(0), field2Value(0), field3Value(0), ...
    {}

    const string& LargeObject::field1() const
    {
      if (field1Value == 0) {
        // read the data for field 1 from the database and
        // make field1Value point to it
      }

      return *field1Value;
    }

Each field in the object is represented as a pointer to the necessary data, and the LargeObject constructor initializes each pointer to null. A null pointer signifies that the field has not yet been read from the database. Every LargeObject member function must check the state of a field's pointer before accessing the data it points to. If the pointer is null, the corresponding data must be read from the database before any operation is performed on it.
Implementing lazy fetching raises a problem: a null pointer may need to be initialized to point to real data from inside any member function, including const member functions such as field1. When you try to modify a data member inside a const member function, however, the compiler complains.
The best way around this is to declare the pointer fields mutable, which means they can be modified inside any member function, even a const one (see Effective C++ Item 21). That is why the fields inside LargeObject above are declared mutable.
The mutable keyword was a relatively late addition to C++, so a very old compiler may not support it. If so, you need to find another way to convince your compiler to let you modify data members inside const member functions. One workable strategy is the "fake this" approach, in which you create a pointer-to-non-const that points to the same object as this. When you want to modify a data member, you access it through the "fake this" pointer (in essence, this removes the const qualifier with const_cast<type>):

    class LargeObject {
    public:
      const string& field1() const;            // unchanged
      ...

    private:
      string *field1Value;                     // not declared mutable,
      ...                                      // because older compilers
    };                                         // don't support it

    const string& LargeObject::field1() const
    {
      // declare a pointer, fakeThis, that points where this does,
      // but with the constness of the object cast away
      LargeObject * const fakeThis = const_cast<LargeObject* const>(this);

      if (field1Value == 0) {
        fakeThis->field1Value =                // this assignment is OK,
          the appropriate data                 // because what fakeThis
          from the database;                   // points to isn't const
      }

      return *field1Value;
    }

This function uses const_cast (see Item 2) to cast away the constness of *this. If your compiler does not support const_cast, you can use an old C-style cast:

    // use an old-style cast to help emulate mutable
    const string& LargeObject::field1() const
    {
      LargeObject * const fakeThis = (LargeObject* const)this;

      ...                                      // as above
    }

Look again at the pointers inside LargeObject. Having to initialize each of them to null and then test each one before every use is tedious and error-prone. Fortunately, that drudgery can be automated with smart pointers, as described in Item M28. If you use smart pointers inside LargeObject, you will also find that you no longer need to declare the pointers mutable. Alas, that is only a temporary respite, because you end up needing mutable once you sit down to implement the smart pointer classes.
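A rough sketch of that automation is shown below. It is not the smart pointer of Item M28: LazyField, fetchCount, and the lambdas that pretend to read from a database are hypothetical, and ObjectID is replaced by a plain int so the example stands alone. Notice that LargeObject no longer mentions mutable, while LazyField itself still does.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical helper, not the smart pointer of Item M28.
template <class T>
class LazyField {
public:
    explicit LazyField(std::function<T()> fetch) : fetch_(fetch) {}

    // const, yet it may fill the cache: this is where mutable reappears.
    const T& get() const {
        if (!cache_) {
            cache_.reset(new T(fetch_()));
        }
        return *cache_;
    }

    bool isLoaded() const { return static_cast<bool>(cache_); }

private:
    std::function<T()> fetch_;
    mutable std::unique_ptr<T> cache_;
};

int fetchCount = 0;  // counts simulated database reads

class LargeObject {
public:
    // The lambdas stand in for real database queries (an assumption).
    explicit LargeObject(int id)
        : field1Value([id]() -> std::string {
              ++fetchCount;
              return "name-" + std::to_string(id);
          }),
          field2Value([id]() -> int {
              ++fetchCount;
              return id * 2;
          }) {}

    const std::string& field1() const { return field1Value.get(); }
    int field2() const { return field2Value.get(); }

private:
    LazyField<std::string> field1Value;  // no mutable needed here
    LazyField<int> field2Value;
};

void demoLazyFetch() {
    fetchCount = 0;
    LargeObject object(21);              // only a "shell": nothing fetched
    assert(fetchCount == 0);
    assert(object.field2() == 42);       // fetches field 2 only
    assert(object.field2() == 42);       // served from the cache
    assert(fetchCount == 1);
    assert(object.field1() == "name-21");
    assert(fetchCount == 2);
}
```

The null test and the fetch now live in one place, LazyField::get, instead of being repeated in every LargeObject accessor.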
Lazy Expression Evaluation
A final example of lazy evaluation comes from numerical applications. Consider this code:

    template<class T>
    class Matrix { ... };                      // for homogeneous matrices

    Matrix<int> m1(1000, 1000);                // a 1000 by 1000 matrix
    Matrix<int> m2(1000, 1000);                // ditto
    ...
    Matrix<int> m3 = m1 + m2;                  // add m1 and m2

The usual implementation of operator+ uses eager evaluation: here it would compute and return the sum of m1 and m2. That is a fairly large computation (1,000,000 additions), and of course there is also the cost of allocating memory to hold all those values.
The lazy evaluation strategy says that is far too much work, so it doesn't do it. Instead, it sets up a data structure inside m3 recording that m3's value is the sum of m1 and m2. Such a data structure might consist of nothing more than a pointer to each of m1 and m2, plus an enum indicating that the operation on them is addition. Clearly, setting up this data structure is much faster than adding m1 and m2, and it uses much less memory, too.
Suppose that later in the program, before m3 has been used, this code is executed:

    Matrix<int> m4(1000, 1000);
    ...                                        // give m4 some values
    m3 = m4 * m1;

Now we can forget all about m3 being the sum of m1 and m2 (and thereby save the cost of that computation), and instead start remembering that m3 is the product of m4 and m1. Needless to say, we don't perform the multiplication either. Why bother? We're lazy, remember?
This example looks contrived, because no good programmer would write a program that computes the sum of two matrices and then fails to use it, but it is not as contrived as it seems. No good programmer deliberately computes a value that is not needed, but during maintenance it is common for a programmer to change the paths through a program so that a formerly useful computation becomes unnecessary. The likelihood of that happening can be reduced by defining objects immediately before they are used (see Effective C++ Item 32), but it still occurs from time to time.
Still, if that were the only time lazy evaluation paid off, it would hardly be worth the trouble. A more common scenario is that we need only part of a computation. For example, suppose we initialize m3 to the sum of m1 and m2 and then use it like this:

    cout << m3[4];                             // print the 4th row of m3

Clearly we can be completely lazy no longer: we have to compute the values in the fourth row of m3. But let's not be overly ambitious either. There is no reason to compute anything more than the fourth row; the rest of m3 can remain uncomputed until it is actually needed. With luck, it never will be.
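The row-at-a-time idea can be sketched like this. The sketch is an assumption-laden simplification rather than a real matrix library: DenseMatrix, LazySum, and rowsComputed are hypothetical names, only addition is supported, and the operands must outlive the lazy sum because it merely points at them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Row;

// A plain, eagerly stored matrix (hypothetical stand-in for Matrix<int>).
struct DenseMatrix {
    DenseMatrix(std::size_t rows, std::size_t cols, int fill)
        : data(rows, Row(cols, fill)) {}
    std::vector<Row> data;
};

// Records "left + right" and computes individual rows only when asked.
class LazySum {
public:
    LazySum(const DenseMatrix& left, const DenseMatrix& right)
        : left_(&left), right_(&right),
          rows_(left.data.size()), computed_(left.data.size(), false),
          rowsComputed_(0) {}

    // Returns row r, adding the operands' rows on first access only.
    const Row& operator[](std::size_t r) const {
        if (!computed_[r]) {
            const Row& a = left_->data[r];
            const Row& b = right_->data[r];
            rows_[r].resize(a.size());
            for (std::size_t c = 0; c < a.size(); ++c) {
                rows_[r][c] = a[c] + b[c];
            }
            computed_[r] = true;
            ++rowsComputed_;
        }
        return rows_[r];
    }

    std::size_t rowsComputed() const { return rowsComputed_; }

private:
    const DenseMatrix* left_;    // operands must outlive this object
    const DenseMatrix* right_;
    mutable std::vector<Row> rows_;
    mutable std::vector<bool> computed_;
    mutable std::size_t rowsComputed_;
};

void demoLazySum() {
    DenseMatrix m1(1000, 1000, 1);
    DenseMatrix m2(1000, 1000, 2);
    LazySum m3(m1, m2);                  // no additions performed yet
    assert(m3.rowsComputed() == 0);

    const Row& row = m3[4];              // only row 4 is computed
    assert(row.size() == 1000);
    assert(row[0] == 3);
    assert(m3.rowsComputed() == 1);
}
```

Asking for row 4 costs 1,000 additions instead of 1,000,000, and asking for it a second time costs none.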
How lucky can we be? Experience in the domain of matrix computations suggests the odds are in our favor. In fact, lazy evaluation lies behind the wonder that is APL. APL was developed in the 1960s for interactive use by people who needed to perform matrix-based calculations. Running on computers that had less computational horsepower than the chips now found in high-end microwave ovens, APL was seemingly able to add, multiply, and even divide large matrices instantly. Its trick was lazy evaluation. The trick was usually effective, because APL users typically added, multiplied, or divided matrices not because they needed the entire resulting matrix, but because they needed only a small part of it. APL used lazy evaluation to defer its computations until it knew exactly which part of a result matrix was needed, and then it computed only that part. In practice, this allowed users to perform computationally intensive tasks interactively on machines that were hopelessly inadequate for an implementation based on eager evaluation.
Computers are faster today, but data sets are bigger and users are less patient, so many contemporary matrix libraries continue to take advantage of lazy evaluation.
To be fair, laziness sometimes fails to pay off. If m3 is used in this way:

    cout << m3;                                // print out all of m3

the jig is up, and we have to compute a complete value for m3.
Similarly, if one of the matrices on which m3 depends is about to be modified, we have to take immediate action:

    m3 = m1 + m2;                              // remember that m3 is the
                                               // sum of m1 and m2

    m1 = m4;                                   // now m3 is the sum of m2
                                               // and the OLD value of m1!

Here we have to do something to ensure that the assignment to m1 does not change m3. Inside the Matrix<int> assignment operator, we might compute m3's value before changing m1, or we might make a copy of the old value of m1 and make m3 depend on that copy. Either way, we must guarantee that m3 has the value it is supposed to have after m1 has been assigned to. Other functions that might modify a matrix must be handled in a similar fashion.
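One way to get the "depend on a copy of the old value" behavior without actually copying is to keep each matrix's values in an immutable, shared snapshot. The sketch below is only one possible design under that assumption; SharedMatrix, DeferredSum, snapshot, and at are hypothetical names, and a flat vector stands in for a two-dimensional matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

typedef std::vector<int> Values;

// Hypothetical matrix whose storage is an immutable, shareable snapshot.
class SharedMatrix {
public:
    SharedMatrix(std::size_t size, int fill)
        : values_(std::make_shared<Values>(size, fill)) {}

    // The compiler-generated assignment rebinds this matrix to the other
    // matrix's snapshot. The old snapshot stays alive for as long as
    // anything (such as a DeferredSum) still holds on to it.

    std::shared_ptr<const Values> snapshot() const { return values_; }

private:
    std::shared_ptr<const Values> values_;
};

// Remembers "left + right" by holding the operands' current snapshots.
class DeferredSum {
public:
    DeferredSum(const SharedMatrix& left, const SharedMatrix& right)
        : left_(left.snapshot()), right_(right.snapshot()) {}

    // Computes a single element on demand.
    int at(std::size_t i) const { return (*left_)[i] + (*right_)[i]; }

private:
    std::shared_ptr<const Values> left_;
    std::shared_ptr<const Values> right_;
};

void demoDependency() {
    SharedMatrix m1(4, 1);
    SharedMatrix m2(4, 2);
    SharedMatrix m4(4, 10);

    DeferredSum m3(m1, m2);              // remember that m3 is m1 + m2
    m1 = m4;                             // m1 now refers to m4's values

    assert(m3.at(0) == 3);               // old m1 + m2, not m4 + m2
}
```

Because the snapshots are never modified in place, assigning to m1 cannot disturb m3. A matrix that supported element-wise writes would have to create a fresh snapshot before writing, which is the copy-on-write technique from the reference-counting example.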
Because of the need to store dependencies between values, to maintain data structures that can hold values, dependencies, or a combination of the two, and to overload operators such as assignment, copying, and addition, lazy evaluation in a numerical domain is a lot of work. On the other hand, it often saves a great deal of time and space at run time.
Summary
These four examples show that lazy evaluation can be useful in a variety of domains: to avoid unnecessary copying of objects, to distinguish reads from writes using operator[], to avoid unnecessary reads from databases, and to avoid unnecessary numerical computations. Nevertheless, it is not always a good idea. Just as putting off cleaning your room saves you no work if your parents always check up on you, lazy evaluation saves your program no work if all your computations are necessary. In fact, if all your computations are essential, lazy evaluation may slow you down and increase your use of memory, because in addition to doing all the computations you were hoping to avoid, you must also maintain the data structures that make lazy evaluation possible in the first place. Lazy evaluation is useful only when there is a reasonable chance your software will be asked to perform computations that can be avoided.
There is nothing about lazy evaluation that is specific to C++. The technique can be applied in any programming language, and several languages, notably APL, some dialects of Lisp, and virtually all dataflow languages, embrace the idea as a fundamental part of the language. Mainstream programming languages employ eager evaluation, however, and C++ is mainstream. Yet C++ is particularly suitable as a vehicle for user-implemented lazy evaluation, because its support for encapsulation makes it possible to add lazy evaluation to a class without clients of that class knowing it has been done.
Look again at the code fragments in the examples above, and you can verify that the class interfaces give no hint about whether eager or lazy evaluation is used. That means it is possible to implement a class using a straightforward eager evaluation strategy and then, if profiling (see Item M16) shows that the class's implementation is a performance bottleneck, to replace that implementation with one based on lazy evaluation (see also Effective C++ Item 34). The only change clients will see (after recompiling and relinking) is improved performance. That is the kind of software enhancement clients love, and one that can make you downright proud to be lazy.
My summary: have the constructor initialize the member pointers to 0, and consider using smart pointers to manage them.
More Effective C++ (17): Consider using lazy evaluation