Memory leak detection and performance analysis using Valgrind tools under Linux

Source: Internet
Author: User
Tags valgrind python script

From http://www.linuxidc.com/Linux/2012-06/63754.htm

Valgrind is commonly used to profile program performance and to detect memory errors, such as leaks, in programs.

One: The Valgrind toolset

Valgrind contains the following tools:

1. Memcheck: Checks a program for memory problems such as leaks, out-of-bounds accesses, and illegal pointers.

2. Callgrind: Records the running time of the code and the call history, for analyzing program performance.

3. Cachegrind: Analyzes the CPU cache hit rate and miss rate, for code optimization.

4. Helgrind: Checks multithreaded programs for race conditions.

5. Massif: A heap profiler, which shows how much heap memory the program uses.

6. Lackey: A small example tool that is seldom used.

7. Nulgrind: A minimal tool that only shows developers how to create a tool.

Each tool is invoked with the command valgrind --tool=name program-name. When the --tool parameter is not specified, the default is --tool=memcheck.

Two: Valgrind tools in detail

1. Memcheck

Memcheck is the most commonly used tool. It detects memory problems in a program: every memory read and write is checked, and all calls to malloc, free, new, and delete are intercepted. As a result, it can detect the following problems:

1. Use of uninitialized memory;

2. Reading or writing a memory block after it has been freed;

3. Reading or writing past the end of a block allocated by malloc;

4. Reading or writing inappropriate areas on the stack;

5. Memory leaks, where the pointer to a block of memory is lost forever;

6. Mismatched use of malloc/free or new/delete;

7. Overlapping dst and src pointers in memcpy() and related functions.

These problems are often the most vexing ones for C/C++ programmers, and Memcheck can be a big help here.
For example:

#include <stdlib.h>
#include <string.h>

void test(void)
{
    int *ptr = malloc(sizeof(int) * 10);

    ptr[10] = 7;               // out-of-bounds write

    memcpy(ptr + 1, ptr, 5);   // source and destination overlap

    free(ptr);
    free(ptr);                 // double free

    int *p1;
    *p1 = 1;                   // write through an uninitialized pointer
}

int main(void)
{
    test();
    return 0;
}
After compiling the program into an executable, run: valgrind --leak-check=full ./program-name
The output is as follows:

==4832== Memcheck, a memory error detector
==4832== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==4832== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==4832== Command: ./tmp
==4832==
==4832== Invalid write of size 4 // memory out of bounds
==4832==    at 0x804843f: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==  Address 0x41a6050 is 0 bytes after a block of size 40 alloc'd
==4832==    at 0x4026864: malloc (vg_replace_malloc.c:236)
==4832==    by 0x8048435: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==
==4832== Source and destination overlap in memcpy(0x41a602c, 0x41a6028, 5) // overlapping copy
==4832==    at 0x4027bd6: memcpy (mc_replace_strmem.c:635)
==4832==    by 0x8048461: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==
==4832== Invalid free() / delete / delete[] // double free
==4832==    at 0x4025bf0: free (vg_replace_malloc.c:366)
==4832==    by 0x8048477: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==  Address 0x41a6028 is 0 bytes inside a block of size 40 free'd
==4832==    at 0x4025bf0: free (vg_replace_malloc.c:366)
==4832==    by 0x804846c: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==
==4832== Use of uninitialised value of size 4 // illegal pointer
==4832==    at 0x804847b: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==
==4832==
==4832== Process terminating with default action of signal 11 (SIGSEGV) // program crashes due to the illegal pointer assignment
==4832==  Bad permissions for mapped region at address 0x419fff4
==4832==    at 0x804847b: test (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==    by 0x804848d: main (in /home/yanghao/desktop/testc/testmem/tmp)
==4832==
==4832== HEAP SUMMARY:
==4832==     in use at exit: 0 bytes in 0 blocks
==4832==   total heap usage: 1 allocs, 2 frees, 40 bytes allocated
==4832==
==4832== All heap blocks were freed -- no leaks are possible
==4832==
==4832== For counts of detected and suppressed errors, rerun with: -v
==4832== Use --track-origins=yes to see where uninitialised values come from
==4832== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 11 from 6)
Segmentation fault
As the Valgrind output shows, all of these errors were detected.
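The test program above frees its only block, so its heap summary reports no leaks. To illustrate item 5 of the list (a pointer to a block being lost), here is a minimal sketch. The names copy_string, leaky_length, and clean_length are hypothetical and were introduced only for this illustration; they are not part of the original example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s that the caller must free. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);   /* no overlap: copy is a freshly allocated block */
    return copy;
}

/* Leaks: the only pointer to the block is lost when the function returns. */
size_t leaky_length(const char *s)
{
    char *copy = copy_string(s);
    return copy ? strlen(copy) : 0;
}

/* Same work, but the block is released before the pointer goes out of scope. */
size_t clean_length(const char *s)
{
    char *copy = copy_string(s);
    size_t len = copy ? strlen(copy) : 0;
    free(copy);               /* free(NULL) is a no-op */
    return len;
}
```

If leaky_length is called from a small main and the program is run with valgrind --leak-check=full, Memcheck should list the copied block in its leak summary, while a program that only calls clean_length should leave nothing in use at exit.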

2. Callgrind

Callgrind is a profiling tool similar to gprof, but it observes the running program in finer detail and can give us more information. Unlike gprof, it does not require special options when compiling the source code, although compiling with debugging information is recommended. Callgrind collects data while the program runs, builds a function call graph, and can optionally perform a cache simulation. At the end of the run, it writes the profile data to a file. callgrind_annotate can convert the contents of this file into a readable form.

To generate a visual graph, download gprof2dot: http://jrfonseca.googlecode.com/svn/trunk/gprof2dot/gprof2dot.py

This is a Python script. After downloading it, make it executable with chmod +x gprof2dot.py and place it in any directory listed in $PATH. I put it in the /usr/bin directory, which allows gprof2dot.py to be run directly from the terminal.

Callgrind can produce a call graph for performance analysis. First, for comparison, consider the usual profiling tool, GNU gprof. To use it, add the -pg option when compiling the program. For example:

#include <stdio.h>
#include <unistd.h>   /* sleep() */

void test(void)
{
    sleep(1);
}

void f(void)
{
    int i;
    for (i = 0; i < 5; i++)
        test();
}

int main(void)
{
    f();
    printf("process is over!\n");
    return 0;
}
First run gcc -pg -o tmp tmp.c, then run the program with ./tmp. When the program exits, a gmon.out file is generated in the current directory (gprof needs this file for its analysis).
Then run gprof ./tmp | gprof2dot.py | dot -Tpng -o report.png and open report.png.

The resulting graph shows that test was called 5 times and that the test function accounts for the largest share of the program's running time.

Now look at how Callgrind generates a call graph. Run: valgrind --tool=callgrind ./tmp. When execution completes, a file named callgrind.out.XXX appears in the current directory; this is the analysis file. You can print the results directly with callgrind_annotate callgrind.out.XXX, or generate a graphical result with gprof2dot.py -f callgrind callgrind.out.XXX | dot -Tpng -o report.png:

The results are very detailed: even the program entry point and library function calls are identified.
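The sleep-based example above spends its time blocked rather than executing instructions, and Callgrind's cost model is based on executed instructions. A CPU-bound workload may therefore give a more informative profile. The following is a minimal sketch under that assumption; busy_sum and f_busy are hypothetical stand-ins introduced here, not functions from the original article.

```c
/* Hypothetical CPU-bound stand-in for test(): sums the integers 1..n. */
unsigned long busy_sum(unsigned long n)
{
    /* volatile keeps the compiler from optimising the loop away */
    volatile unsigned long total = 0;
    for (unsigned long i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Calls busy_sum five times, mirroring the loop in f() above. */
unsigned long f_busy(unsigned long n)
{
    unsigned long acc = 0;
    for (int i = 0; i < 5; i++)
        acc += busy_sum(n);
    return acc;
}
```

Calling f_busy with a large n from main and profiling with valgrind --tool=callgrind should attribute most of the instruction count to busy_sum, with five calls recorded from f_busy.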

3. Cachegrind

Cachegrind is a cache profiler. It simulates the CPU's first-level caches (I1 and D1) and its L2 cache, and can pinpoint cache misses and hits in the program. If needed, it can also report the number of cache misses and memory references for each line of code, each function, each module, and the whole program. This is a great help in optimizing a program.

As an endorsement: the Valgrind developers themselves used this tool to improve Valgrind's performance by 25%-30% over a few months. According to earlier reports, the KDE development team has also thanked Valgrind for its help in improving KDE's performance.

It is used in the same way as the other tools: valgrind --tool=cachegrind program-name
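As an illustration of the kind of code where a cache profile is useful, the sketch below sums the same matrix in two traversal orders. The names grid, ROWS, COLS, fill_grid, sum_row_major, and sum_col_major are assumptions made for this example. Both functions return the same value; they differ only in memory access pattern, and the column-order loop would typically be expected to show more data-cache misses under Cachegrind.

```c
#include <stddef.h>

#define ROWS 256
#define COLS 256

static int grid[ROWS][COLS];

/* Fill the grid with a known pattern so the two sums can be compared. */
void fill_grid(void)
{
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = (int)((r + c) % 7);
}

/* Row-major traversal: consecutive accesses are adjacent in memory. */
long sum_row_major(void)
{
    long sum = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            sum += grid[r][c];
    return sum;
}

/* Column-major traversal: consecutive accesses are COLS ints apart. */
long sum_col_major(void)
{
    long sum = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            sum += grid[r][c];
    return sum;
}
```

Comparing the per-function miss counts for the two sums is one way to see the effect of access order; the actual numbers depend on the simulated cache sizes.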

4. Helgrind

Helgrind is primarily used to check for race conditions in multithreaded programs. It looks for memory locations that are accessed by more than one thread without a consistently held lock; such locations often indicate missing synchronization between threads and can lead to hard-to-find bugs. Helgrind implements a race-detection algorithm called "Eraser", with further improvements to reduce the number of false errors it reports. However, Helgrind is still at an experimental stage.

Let's start with an example of a race condition:

#include <stdio.h>
#include <pthread.h>

#define NLOOP 50

int counter = 0; /* incremented by threads */

void *threadfn(void *);

int main(int argc, char **argv)
{
    pthread_t tid1, tid2, tid3;

    pthread_create(&tid1, NULL, &threadfn, NULL);
    pthread_create(&tid2, NULL, &threadfn, NULL);
    pthread_create(&tid3, NULL, &threadfn, NULL);

    /* wait for all three threads to terminate */
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);
    pthread_join(tid3, NULL);
    return 0;
}

void *threadfn(void *vptr)
{
    int i, val;

    for (i = 0; i < NLOOP; i++) {
        val = counter;   /* unsynchronized read */
        printf("%x:%d\n", (unsigned int) pthread_self(), val + 1);
        counter = val + 1;   /* unsynchronized write */
    }
    return NULL;
}

The race in this program is in the loop body of threadfn, between the read val = counter and the write counter = val+1. The intended effect is for 3 threads to increment the global variable 50 times each, so that its final value is 150. Because there is no lock, the race can prevent the program from reaching that result. Let's see how Helgrind helps us detect the race. First compile the program: gcc -o test thread.c -lpthread, then run: valgrind --tool=helgrind ./test. The results are as follows:

49c0b70:1
49c0b70:2
==4666== Thread #3 was created
==4666==    at 0x412e9d8: clone (clone.S:111)
==4666==    by 0x40494b5: pthread_create@@GLIBC_2.1 (createthread.c:256)
==4666==    by 0x4026e2d: pthread_create_WRK (hg_intercepts.c:257)
==4666==    by 0x4026f8b: pthread_create@* (hg_intercepts.c:288)
==4666==    by 0x8048524: main (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==
==4666== Thread #2 was created
==4666==    at 0x412e9d8: clone (clone.S:111)
==4666==    by 0x40494b5: pthread_create@@GLIBC_2.1 (createthread.c:256)
==4666==    by 0x4026e2d: pthread_create_WRK (hg_intercepts.c:257)
==4666==    by 0x4026f8b: pthread_create@* (hg_intercepts.c:288)
==4666==    by 0x8048500: main (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==
==4666== Possible data race during read of size 4 at 0x804a028 by thread #3
==4666==    at 0x804859C: threadfn (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==    by 0x4026f60: mythread_wrapper (hg_intercepts.c:221)
==4666==    by 0x4048e98: start_thread (pthread_create.c:304)
==4666==    by 0x412e9ed: clone (clone.S:130)
==4666==  This conflicts with a previous write of size 4 by thread #2
==4666==    at 0x80485CA: threadfn (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==    by 0x4026f60: mythread_wrapper (hg_intercepts.c:221)
==4666==    by 0x4048e98: start_thread (pthread_create.c:304)
==4666==    by 0x412e9ed: clone (clone.S:130)
==4666==
==4666== Possible data race during write of size 4 at 0x804a028 by thread #2
==4666==    at 0x80485CA: threadfn (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==    by 0x4026f60: mythread_wrapper (hg_intercepts.c:221)
==4666==    by 0x4048e98: start_thread (pthread_create.c:304)
==4666==    by 0x412e9ed: clone (clone.S:130)
==4666==  This conflicts with a previous read of size 4 by thread #3
==4666==    at 0x804859C: threadfn (in /home/yanghao/desktop/testc/testmem/a.out)
==4666==    by 0x4026f60: mythread_wrapper (hg_intercepts.c:221)
==4666==    by 0x4048e98: start_thread (pthread_create.c:304)
==4666==    by 0x412e9ed: clone (clone.S:130)
==4666==
49c0b70:3
......
55c1b70:51
==4666==
==4666== For counts of detected and suppressed errors, rerun with: -v
==4666== Use --history-level=approx or =none to gain increased speed, at
==4666== the cost of reduced accuracy of conflicting-access information
==4666== ERROR SUMMARY: 8 errors from 2 contexts (suppressed: 99 from 31)

Helgrind successfully found the location of the race: the two "Possible data race" reports point to the conflicting read and write in threadfn.
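One common way to remove this kind of race is to protect the shared counter with a mutex. The sketch below is an assumption about how the example could be repaired, not the original author's code; counter_lock, threadfn_locked, NTHREADS, and run_locked_counter are hypothetical names, and the printf call is omitted to keep the critical section short.

```c
#include <pthread.h>
#include <stddef.h>

#define NLOOP 50
#define NTHREADS 3

static int counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each increment is a read-modify-write done while holding the lock. */
static void *threadfn_locked(void *arg)
{
    (void)arg;
    for (int i = 0; i < NLOOP; i++) {
        pthread_mutex_lock(&counter_lock);
        counter = counter + 1;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts NTHREADS workers, waits for them, and returns the final counter,
   or -1 if a thread could not be created. */
int run_locked_counter(void)
{
    pthread_t tids[NTHREADS];
    int started = 0;
    int failed = 0;

    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tids[i], NULL, threadfn_locked, NULL) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return failed ? -1 : counter;
}
```

With every access to counter made under the same lock, the final value should be 150, and Helgrind would not be expected to report a race on counter.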

5. Massif

Massif is a heap profiler. It measures how much heap memory the program uses, and reports the size of heap blocks, heap administration overhead, and stack sizes. Massif can help us reduce memory usage; on modern systems with virtual memory, this can also speed up our programs and reduce the chance that the program ends up in swap space.

Massif profiles the allocation and release of memory. It gives developers insight into the memory usage behavior of a program so that they can optimize it. This is especially useful for C++, because C++ has many hidden memory allocations and releases.
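To make the idea of allocation behavior concrete, here is a small sketch of a workload whose heap usage grows in steps. build_squares is a hypothetical function written for this illustration; the doubling strategy is just one common choice and is not taken from the original article.

```c
#include <stdlib.h>

/* Hypothetical workload: builds an array holding the squares 0*0 .. (n-1)*(n-1),
   doubling the buffer capacity whenever it fills up.
   Returns NULL on allocation failure; the caller must free the result. */
long *build_squares(size_t n)
{
    size_t cap = 4;
    long *data = malloc(cap * sizeof *data);
    if (data == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (i == cap) {
            cap *= 2;
            long *grown = realloc(data, cap * sizeof *grown);
            if (grown == NULL) {
                free(data);   /* the old block is still valid when realloc fails */
                return NULL;
            }
            data = grown;
        }
        data[i] = (long)(i * i);
    }
    return data;
}
```

Running a program that calls build_squares with a large n under valgrind --tool=massif should show the heap growing each time the buffer is reallocated, which is the kind of pattern a heap profile helps to spot.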

In addition, Lackey and Nulgrind are also available. Lackey is a small example tool that is seldom used; Nulgrind only shows developers how to create a tool. They are not covered here.

Three: Using Valgrind

Valgrind is very simple to use, and you do not even need to recompile your program to use it. Of course, to get the best results and the most accurate information, you still need to recompile as recommended. For example, when using Memcheck, it is best to turn off compiler optimization.

The format of the Valgrind command is as follows:

valgrind [valgrind-options] your-prog [your-prog-options]

Some of the commonly used options are:

Option
    Function

-h, --help
    Displays help information.

--version
    Displays the version of the Valgrind core; each tool has its own version.

-q, --quiet
    Runs quietly, printing only error messages.

-v, --verbose
    Prints more detailed information.

--tool=<toolname> [default: memcheck]
    The most commonly used option. Runs the Valgrind tool named toolname. If the option is omitted, Memcheck is run by default.

--db-attach=<yes|no> [default: no]
    Attaches a debugger when an error is found, to make debugging easier.
