More Effective C++ ---- (18) Amortize the Cost of Expected Computations

Source: Internet
Author: User
Tags: diff, prefetch

Item M18: Amortize the cost of expected computations
In Item M17, I extolled the virtues of laziness, of putting things off as long as possible, and explained how laziness can improve a program's efficiency. In this Item I adopt a different attitude. Here there will be no laziness. I encourage you to have your programs do more than they are asked to do, as a way of improving software performance. The core of this Item is over-eager evaluation: doing things before you are asked to do them. For example, consider the following template class for representing collections of large amounts of numeric data:

template<class NumericalType>
class DataCollection {
public:
    NumericalType min() const;
    NumericalType max() const;
    NumericalType avg() const;
    ...
};

Assuming the min, max, and avg functions return the current minimum, maximum, and average values of the collection, there are three ways to implement them. Using eager evaluation, we would examine all the values in the collection whenever min, max, or avg was called and return the appropriate value. Using lazy evaluation, we would have the functions return data structures that could be used to determine the exact value only when the function's return value was actually needed. Using over-eager evaluation, we would keep a running track of the collection's minimum, maximum, and average values, so when min, max, or avg is called, we can return the correct value immediately, with no computation required. If min, max, and avg are called frequently, we amortize the cost of tracking the collection's minimum, maximum, and average values over all the calls to those functions, and the cost per call will be lower than with eager or lazy evaluation.
The idea behind over-eager evaluation is that if you expect a computation to be requested frequently, you can lower the average cost per request by designing a data structure that handles those requests especially efficiently.
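To make the idea concrete, here is a minimal sketch (my own, not the book's) of an over-eager DataCollection: it updates its minimum, maximum, and running sum on every insertion, so min(), max(), and avg() become constant-time at call time. The insert member and the internal counters are assumptions added for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// Over-eager evaluation: pay a little on every insert so that
// min(), max(), and avg() need no computation when called.
template<class NumericalType>
class DataCollection {
public:
    DataCollection()
      : count_(0), sum_(0),
        min_(std::numeric_limits<NumericalType>::max()),
        max_(std::numeric_limits<NumericalType>::lowest()) {}

    void insert(NumericalType v) {
        ++count_;
        sum_ += v;                       // running sum for avg()
        min_ = std::min(min_, v);        // track extremes eagerly
        max_ = std::max(max_, v);
    }

    NumericalType min() const { return min_; }   // O(1): already known
    NumericalType max() const { return max_; }   // O(1): already known
    double avg() const {
        return count_ ? static_cast<double>(sum_) / count_ : 0.0;
    }

private:
    std::size_t   count_;
    NumericalType sum_;
    NumericalType min_, max_;
};
```

The trade-off is exactly the one the Item describes: every insertion is slightly slower, but if min, max, or avg are called often, the amortized cost per query beats recomputing over the whole collection.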
One of the simplest ways to be over-eager is to cache values that have already been computed and are likely to be needed again. For example, suppose you are writing a program to provide information about employees, and one frequently requested piece of information is an employee's office cubicle number. Employee information is kept in a database, but for most applications cubicle numbers are irrelevant, so the database is not optimized for finding them. To keep your program from placing undue stress on the database, you can write a function, findCubicleNumber, that caches the cubicle numbers it looks up. Subsequent requests for cubicle numbers that have already been retrieved can then be satisfied from the cache instead of by querying the database. (In effect, the data is read into memory.)
Here is one way to implement findCubicleNumber; it uses a map object from the Standard Template Library (the STL; see Item M35) as the local cache:

int findCubicleNumber(const string& employeeName)
{
    // define a static map to hold (employee name, cubicle number)
    // pairs. This map is the local cache.
    typedef map<string, int> CubicleMap;
    static CubicleMap cubes;

    // try to find an entry for employeeName in the cache;
    // the STL iterator "it" will then point to the found
    // entry, if there is one (see Item M35 for details)
    CubicleMap::iterator it = cubes.find(employeeName);

    // "it"'s value will be cubes.end() if no entry was
    // found (this is standard STL behavior). If this is
    // the case, consult the database for the cubicle
    // number, then add it to the cache
    if (it == cubes.end()) {
        int cubicle =
            the result of looking up employeeName's cubicle number in the database;

        cubes[employeeName] = cubicle;      // add the pair
                                            // (employeeName, cubicle)
                                            // to the cache
        return cubicle;
    }
    else {
        // "it" points to the correct cache entry, which is a
        // (employee name, cubicle number) pair. We want only the
        // second component of this pair, and the member "second"
        // will give it to us
        return (*it).second;
    }
}

Don't get bogged down in the details of the STL code (they will be clearer after you have read Item M35). Focus instead on the strategy this function embodies: it uses a local cache to replace comparatively expensive database queries with comparatively inexpensive lookups in an in-memory data structure. If cubicle numbers are frequently requested more than once, the use of a cache in findCubicleNumber reduces the average cost of returning a cubicle number.
(One detail of the code above deserves explanation: the final statement returns (*it).second instead of the more conventional it->second. Why? The answer is conformance to the STL. Briefly, an iterator is an object, not a pointer, so there is no guarantee that "->" can be correctly applied to it. The STL does require that "." and "*" be valid for iterators, however, so (*it).second, though syntactically clumsy, is guaranteed to work.)
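The database lookup in the function above is deliberately left as pseudocode. As a self-contained sketch, here is a runnable variant in which queryDatabase is a stub I invented (it counts its own calls, so the effect of the cache is observable):

```cpp
#include <map>
#include <string>

static int dbQueries = 0;   // instrumentation: how often we "hit the database"

// Stand-in for an expensive database query (invented for this sketch).
int queryDatabase(const std::string& employeeName) {
    ++dbQueries;
    return static_cast<int>(employeeName.size()) * 100;  // dummy cubicle number
}

int findCubicleNumber(const std::string& employeeName) {
    typedef std::map<std::string, int> CubicleMap;
    static CubicleMap cubes;                 // the local cache

    CubicleMap::iterator it = cubes.find(employeeName);
    if (it == cubes.end()) {                 // cache miss: query once,
        int cubicle = queryDatabase(employeeName);
        cubes[employeeName] = cubicle;       // then remember the result
        return cubicle;
    }
    return (*it).second;                     // cache hit: no database traffic
}
```

Calling findCubicleNumber twice with the same name performs only one database query; every later request for that name is served from the in-memory map.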
Caching is one way to amortize the cost of anticipated computations. Prefetching is another. You can think of prefetching as the computational equivalent of a discount for buying in bulk. For example, when a disk controller reads data from disk, it reads a whole block or sector of data, even if the program asked for only a small amount. That's because reading one large chunk of data is faster than reading two or three small chunks at different times. Furthermore, experience has shown that if data in one place is needed, data near it is likely to be needed too. This is the phenomenon of locality of reference, and it justifies system designers in employing disk caches, memory caches for both instructions and data, and instruction prefetch.
You say you don't worry about anything as low-level as disk controllers or CPU caches? No problem: prefetching pays off in high-level applications, too. For example, imagine you are implementing a template for dynamic arrays, arrays that start with a given size and then extend themselves automatically so that all nonnegative indexes are valid:
template<class T>              // template for dynamic
class DynArray { ... };        // array-of-T classes

DynArray<double> a;            // at this point, only a[0]
                               // is a valid array element

a[22] = 3.5;                   // a is automatically
                               // extended: now indexes
                               // 0-22 are valid

a[32] = 0;                     // a extends itself again;
                               // now a[0]-a[32] are valid

How does a DynArray object extend itself when it needs to? A straightforward approach is to allocate only the additional memory that is required, something like this:
template<class T>
T& DynArray<T>::operator[](int index)
{
    if (index < 0) {
        throw an exception;            // negative indexes are
    }                                  // still invalid

    if (index > the current maximum index value) {
        call new to allocate enough additional memory
        so that index is valid;
    }

    return the indexth element of the array;
}

This approach calls new each time the array's length must be increased, but calling new invokes operator new (see Item M8), and calls to operator new (and operator delete) are usually expensive. That's because they typically result in calls to the underlying operating system, and system calls are generally slower than in-process function calls. As a result, we should try to make as few of them as possible.
An over-eager evaluation strategy employs this reasoning: if we must increase the size of the array now to accommodate index i, the locality-of-reference principle suggests that we will probably have to increase it again in the future to accommodate some other index a bit larger than i. To avoid the cost of the memory allocation for that second (anticipated) expansion, we increase the size of the DynArray now by more than is required to make i valid, and we hope that future expansions fall within the range we have provided for. For example, we could write DynArray::operator[] like this (vector's memory allocation strategy is presumably based on the same idea):

template<class T>
T& DynArray<T>::operator[](int index)
{
    if (index < 0) throw an exception;

    if (index > the current maximum index value) {
        int diff = index - the current maximum index value;

        call new to allocate enough additional memory
        so that index+diff is valid;
    }

    return the indexth element of the array;
}

This function allocates twice as much memory as is strictly needed each time the array must be extended. If we look again at the scenario we saw earlier, we note that DynArray now allocates additional memory only once, even though its logical size is extended twice:
DynArray<double> a;            // only a[0] is valid

a[22] = 3.5;                   // new is called to expand
                               // a's storage through
                               // index 44; a's logical
                               // size becomes 23

a[32] = 0;                     // a's logical size is
                               // changed to allow use of
                               // a[32], but new is not
                               // called

If a needs to be extended again, that extension will be inexpensive, provided the new maximum index is no greater than 44.
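The expansion logic above can be sketched as runnable code. This is not the book's implementation: I build on std::vector for storage, and the growth rule (grow by twice the distance past the current end) is my approximation of the index+diff idea. The physicalSize member is an invented helper that exposes the allocated size for inspection.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Over-eager growth: when an index falls past the current end,
// allocate roughly twice the shortfall so that nearby future
// indexes need no further allocation.
template<class T>
class DynArray {
public:
    T& operator[](int index) {
        if (index < 0)
            throw std::out_of_range("negative index");
        if (index >= static_cast<int>(data_.size())) {
            int diff = index - static_cast<int>(data_.size()) + 1;
            data_.resize(data_.size() + 2 * diff);   // over-allocate;
        }                                            // new slots are zeroed
        return data_[index];
    }

    // invented helper: how many elements are physically allocated
    std::size_t physicalSize() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

With this growth rule, assigning to a[22] on an empty array allocates 46 slots in one step, so a later assignment to a[32] triggers no second allocation, mirroring the scenario in the text.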
There is a common theme running through this Item: greater speed is often purchased at a cost of increased memory usage. Keeping running track of minima, maxima, and averages requires extra space, but it saves time. Caching results requires more memory, but it reduces the time needed to regenerate the results once they have been cached. Prefetching demands a place to put the things being prefetched, but it reduces the time needed to access them. The story is as old as computing itself: you can often trade space for time. (Not always, though. Using larger objects means fewer of them fit on a virtual memory page or a cache page. In rare cases, making objects bigger reduces the performance of your software, because paging activity increases (see the translator's note on memory management in operating systems), your cache hit rate decreases, or both happen at once. How do you find out whether you are suffering from such problems? You must profile, profile, profile (see Item M16).)
The advice I offer in this Item, namely to amortize the cost of anticipated computations through over-eager strategies such as caching and prefetching, does not contradict the advice on lazy evaluation in Item M17. Lazy evaluation is a technique for improving the efficiency of programs when you must support operations whose results are not always needed. Over-eager evaluation is a technique for improving efficiency when you must support operations whose results are almost always needed, or whose results are often needed more than once. The large performance gains these techniques can produce justify the effort they require.


Summary: trade space for time (as in vector's memory allocation strategy!)

