/****************************************************************/
/* Learning is a collaborative and shared style!
/* Author: atlas    Email: [email protected]
/* Reprint: please indicate the source of this article:
/* http://blog.csdn.net/wdzxl198/article/details/9178099
/****************************************************************/
Previous installment review:
C++ Memory Management Learning Notes (6)
3 Memory Leaks: 3.1 Introduction to solutions for memory leaks; 3.2 How to deal with memory leaks; 3.3 On C++ memory leaks and detection tools
3.3.3 Introduction to memory leak detection tools
In the article "C++ memory leaks and detection", the author briefly introduced memory leak detection under Windows and Linux. Here I draw on its general content to deepen the memory-leak part of these memory management notes. Since I currently develop mainly on Linux, the explanation focuses on the Linux platform; the principle is the same across platforms, only the form differs.
First, let's look at a simple memory-leak example. The allocated block is never freed; the program exits quickly so little harm is done, but if GetMemory were called inside a while (1) loop, memory would eventually run out:
#include <stdlib.h>
#include <iostream>

using namespace std;

void GetMemory(char *p, int num)
{
    // p is a local copy of the caller's pointer, so the caller never
    // receives this block and it can never be freed. The same leak
    // can be produced (and detected) with new.
    p = (char *)malloc(sizeof(char) * num);
}

int main(int argc, char **argv)
{
    char *str = NULL;
    GetMemory(str, 100);  // str is still NULL afterwards; 100 bytes leak
    cout << "Memory leak test!" << endl;
    // If main contained a loop that keeps calling GetMemory, e.g.
    //     while (1) { GetMemory(str, 100); }
    // the problem would become very serious.
    return 0;
}
On the Linux platform the rationale is as follows: record each memory allocation and each free, then compare the two at program end to determine whether anything leaked. Here the author recommends the mtrace tool under Linux, as well as another very powerful tool, Valgrind.
Running the program under Valgrind shows:
==6118== 100 bytes in 1 blocks are definitely lost in loss record 1 of 1
==6118==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
==6118==    by 0x8048724: GetMemory(char*, int) (in /home/netsky/workspace/a.out)
==6118==    by 0x804874E: main (in /home/netsky/workspace/a.out)
The leak is caused by the malloc call inside GetMemory, which is invoked from main: 100 bytes of memory are lost.
Things to notice:
There is a lot of information in each error message; read it carefully.
The first line ("Heap Summary") tells you what kind of error it is.
Below the first line is a stack trace telling you where the problem occurred. Stack traces can get quite large and confusing, especially if you are using the C++ STL. Reading them from the bottom up can help.
The code addresses (e.g. 0x4024F20) are usually unimportant, but occasionally crucial for tracking down weirder bugs.
The stack trace tells you where the leaked memory was allocated. Memcheck cannot tell you why the memory leaked, unfortunately. (Ignore the "vg_replace_malloc.c"; that's an implementation detail.)
There are several kinds of leaks; the important categories are:
"definitely lost": your program is leaking memory -- fix it!
"probably lost": your program is leaking memory, unless you are doing unusual things with pointers that could cause them to point to the middle of a heap block.
In addition, there are other tools, such as a memory leak check tool for Microsoft Visual C++ that I have not used myself; the introduction on its official site looks good: Visual Leak Detector is a free, robust, open-source memory leak detection system for Visual C++ (it supports Visual C++ 2008/2010/2012). Its Code Project page is http://www.codeproject.com/articles/9815/visual-leak-detector-enhanced-memory-leak-detectio.
3.4 About memory recycling
Memory recycling is an effective mechanism for ensuring that memory is not leaked.
(1) Analysis of three kinds of memory objects
The advantage of a stack object is that it is created automatically at the appropriate time and destroyed automatically at the appropriate time; the programmer need not worry about it. A stack object is also usually created faster than a heap object: allocating a heap object calls operator new, which runs some kind of free-space search algorithm that can be time-consuming, while creating a stack object merely moves the stack-top pointer. Note, however, that stack space is generally small, typically 1 MB~2 MB, so larger objects are not suitable for stack allocation. Also note that recursive functions had best avoid large stack objects, because the required stack space grows linearly with recursion depth; when it is exhausted, the stack overflows and the program fails at run time.
The heap object's creation and destruction times are defined precisely by the programmer, meaning the programmer has complete control over its lifetime. We often need such objects. For example, to create an object that several functions can access without making it global, a good choice is to create a heap object and pass its pointer between the functions, thereby sharing the object. In addition, heap capacity is much larger than stack space. In fact, when physical memory is insufficient and a new heap object is needed, a run-time error usually does not occur: the system extends the actual physical memory with virtual memory.
Next, look at static objects.
The first is the global object. Global objects provide the simplest way for inter-class and inter-function communication, although the approach is not elegant. In general, a fully object-oriented language has no global objects, C# for example, because global objects mean unsafety and high coupling; overusing them greatly reduces a program's robustness, stability, maintainability, and reusability. C++ could also have eliminated global objects entirely, but in the end did not; I think one of the reasons is compatibility with C.
Second is the static member of a class. As mentioned above, all objects of a base class and its derived classes share one static member object, so static members are a good choice when data must be shared or communicated between those classes or between objects of those classes.
Then there is the static local object, which can hold the intermediate state of the function it lives in. The most notable example is the recursive function. A recursive function calls itself; if a non-static local object is defined in it, the overhead becomes huge once the recursion depth is large, because a non-static local object is a stack object: every recursive call creates one and every return releases it. Moreover, such an object is confined to its own call level, invisible both to deeper nesting levels and to outer levels; each level has its own local objects and parameters.
In a recursive function design, a static object can replace the non-static local (stack) object. This not only saves the cost of creating and releasing the non-static object on every recursive call and return, but also lets the static object hold the intermediate state of the recursion and be accessed by every call level.
(2) Several basic methods of garbage collection
In this part, drawing on the article "Several classic garbage collection algorithms in C++", the author presents several garbage collection methods: the reference counting algorithm, the mark-sweep algorithm, the mark-compact algorithm, and the node copy algorithm.
1. Reference counting algorithm: the use of smart pointers
The reference counting (Reference Counting) algorithm has each object keep a count of the pointers that point to it: when a new pointer points to the object the count increases by 1; when a pointer to it is deleted the count decreases by 1; and when the count drops to 0, no pointer refers to the object, so it can be safely destroyed.
One advantage of reference counting is that memory-management overhead is spread "smoothly" across the entire run of the application; there is no need to pause the program for garbage collection. Another advantage is better spatial locality of reference: when an object's count reaches 0, the system need not touch cells on other pages of the heap, whereas the tracing collectors described below visit all surviving objects before recycling, which may cause paging. Finally, reference counting reclaims discarded objects promptly, while in the algorithms below a discarded object survives for some time before being recycled.
Reference counting has many advantages, but its drawbacks are also very obvious. The first is time overhead: every time an object is created or released, the count must be updated, which adds extra cost. The second is space overhead: each object must carry its own reference count, paying extra space to hold it. The biggest drawback is that it cannot handle circular references: if two discarded objects refer to each other, neither count ever reaches 0 even though both are unreachable, so they are never recycled. Reference counting is powerless here, while the other garbage collection algorithms handle circular references well.
The most famous use of the reference counting algorithm is Microsoft's COM technology, with its famous IUnknown interface:
interface IUnknown
{
    virtual HRESULT __stdcall QueryInterface(const IID& iid, void** ppv) = 0;
    virtual ULONG __stdcall AddRef() = 0;
    virtual ULONG __stdcall Release() = 0;
};
AddRef and Release let the component manage its own life cycle; the client program cares only about the interface and need not care about the component's lifetime. A simple usage example follows:
The client program above need not call AddRef again because CreateInstance has already called it; it calls Release after it is done with the interface, so the count that the component maintains changes accordingly. The following gives a simple example of implementing AddRef and Release:
The Python programming language also uses reference counting: when an object's reference count reaches 0, its __del__ method is called. As for why Python chose reference counting, according to an article I read, Python as a scripting language must often interoperate with other languages, and reference counting avoids changing an object's position in memory. To solve the circular-reference problem Python also introduced a gc module, so in essence Python's GC scheme is a mix of two garbage collection mechanisms: reference counting plus tracing (the three algorithms below).
2. Mark-sweep algorithm
The mark-sweep (Mark-Sweep) algorithm relies on a global traversal of all surviving objects to determine which objects can be recycled: the traversal starts from the roots and finds every reachable object; all other, unreachable objects are garbage and can be recycled. The whole process is divided into two phases: the mark phase finds all surviving objects, and the sweep phase clears all garbage objects.
Mark phase:
Sweep phase:
Compared with reference counting, the mark-sweep algorithm handles circular references quite naturally, and it avoids the cost of updating reference counts on every object creation and destruction. Its drawback is that it is a stop-the-world algorithm: the application must pause while the garbage collector runs, so much research on mark-sweep aims at reducing its pause time, and the generational garbage collector mentioned later exists for this very purpose. In addition, the mark phase must traverse all surviving objects, which has its own cost, and the sweep phase leaves a large amount of memory fragmentation behind when the garbage objects are cleared.
3. Mark-compact algorithm
The mark-compact algorithm was created to solve the memory-fragmentation problem. Its whole process can be described as: mark all surviving objects, compact the object graph by relocating the surviving objects, and update the pointers to every object that was moved.
Mark phase:
Compact phase:
The hardest point of the mark-compact algorithm is choosing the compaction algorithm; a poor choice leads to serious performance problems, for example a low cache hit rate. In general, depending on where objects are moved, compaction algorithms fall into the following three kinds:
1. Arbitrary: objects are moved without regard to their original order or to any cross-references between them.
2. Linear: objects are placed, as far as possible, adjacent to the objects they point to, achieving better spatial locality.
3. Sliding: objects are "slid" toward one end of the heap, "squeezing out" the free cells between surviving objects, thus preserving the original allocation order.
4. Node Copy algorithm
The node copy algorithm divides the whole heap into two halves (from-space and to-space); the GC process is actually the process of copying the surviving objects from one half to the other, and at the next collection the two halves swap roles. After the move ends, pointers to the moved objects are updated.
Before the GC starts:
When the GC ends:
Because the node copy algorithm compacts memory as it copies, there is no fragmentation problem and no separate compaction pass is needed. Its biggest drawback is that it requires double the space.
This article has introduced four classic garbage collection algorithms, three of which are usually called tracing garbage collection. Reference counting is the exception: it collects smoothly with no stop-the-world pauses, which is why it often appears in real-time systems, but it cannot solve the circular-reference problem. Tracing mechanisms, on the other hand, must traverse or copy all surviving objects on every collection, which is very time-consuming. A good remedy is to partition the objects on the heap and apply different garbage collection algorithms to different regions; the generational garbage collector is one such scheme. Generational collection is used by both the CLR and the JVM, although they differ somewhat in their handling.
(3) Real garbage collection (garbage collectors): "Re-discussing garbage collection (GC) in C++"
- C++ does not support GC. GC is an optional component for C++, not a required one.
- Here are some GC-related code bases and documents for C++:
- C++/CLI (Microsoft's managed C++)
- http://www.hpl.hp.com/personal/Hans_Boehm/gc/ (many people develop with this GC)
- http://www.devarticles.com/c/a/Cplusplus/A-Simple-Garbage-Collector-for-C-plus-plus/
- http://cybertiggyr.com/gene/garbage-collection-cpp/
- Evaluation: if your system is simple, using such a GC is a good idea, because it means your C++ is then no different from Java or C#. But note that this requires your system to be pure, that is to say:
- You use no third-party code, or the memory in your third-party code is also GC-managed (created by the GC).
- Be careful with DLLs whose source code you cannot get (for example the Win32 API, or third-party components you purchase). As in C#, you must take care that the GC does not recycle data while a Win32 API is still accessing it. But C++ is much better off here than C#: after all, in C++ you have far greater control over the GC.
- In particular, if you use COM you will be miserable: COM manages object lifetimes by reference counting, which means the GC is largely powerless, unless, like C#, you provide a wrapper for each COM component.
Additional notes:
- Because a GC relies on the language's introspection (reflection) capability, and C++'s is undoubtedly quite weak, GC in C++ is awkward.
- Note that GCs are exclusive: two GCs cannot work together. Therefore, unless the C++ standard specifies which GC must be used, managed C++ code bases will have difficulty interoperating.
Refer to the "C++ Memory Management Learning Program".
I have been busy with research projects in recent days, so these study notes were not posted promptly. This is the last article of the notes, though not necessarily the end.
Edit by atlas,
Time: 2013/6/20 09:35
C++ Memory Management Learning Notes (7)