Valgrind Callgrind tool for multi-threaded performance analysis


Installation (either of the following):
1. Download a release from http://valgrind.org/downloads/old.html
2. yum install valgrind

Valgrind's main author, Julian Seward, has just won the Best Tool Maker award, one of this year's Google-O'Reilly Open Source Awards. Let's take a look at his work. Valgrind is a suite of simulation-based debugging and profiling tools for Linux. It consists of a core, which provides a synthetic CPU implemented in software, and a set of tools, each of which performs one task: debugging, profiling, or testing. Valgrind can detect memory leaks and invalid memory accesses, analyze cache usage, and more. It is flexible and powerful, goes straight to the heart of a bug, and is truly a programmer's Swiss Army knife.
I. Valgrind overview
At the time of writing, the latest version of Valgrind is 3.2.0, which typically includes the following tools:
1. Memcheck
The most commonly used tool, for detecting memory problems in a program. Every memory read and write is checked, and every call to malloc()/free()/new/delete is intercepted. It can therefore detect the following problems:
1. Use of uninitialized memory;
2. Reading or writing a memory block after it has been freed;
3. Reading or writing past the end of a block allocated by malloc;
4. Reading or writing inappropriate areas on the stack;
5. Memory leaks, where the pointers to a block of memory are lost forever;
6. Mismatched use of malloc/free or new/delete;
7. Overlapping src and dst pointers in memcpy() and related functions.
These are often the most vexing problems for C/C++ programmers, and Memcheck is very helpful here.

2. Callgrind
A profiling tool similar to gprof, but it observes the running program in finer detail and can give us more information. Unlike gprof, it does not require any special options when compiling the source code, although compiling with debugging information is recommended. Callgrind collects data while the program runs, builds a function call graph, and can optionally perform a cache simulation. At the end of the run, it writes the profile data to a file. callgrind_annotate can convert the contents of this file into a readable form.

3. Cachegrind
The cache profiler. It simulates the CPU's first-level caches (I1 and D1) and the L2 cache, and can pinpoint cache misses and hits in the program. On request, it can also report the number of cache misses, memory references, and instructions executed for each line of code, each function, each module, and the entire program. This is a great help when optimizing a program.

4. Helgrind
Used mainly to find data races in multithreaded programs. Helgrind looks for memory locations that are accessed by more than one thread without being consistently protected by a lock. Such locations usually indicate missing synchronization between threads and can cause errors that are hard to track down. Helgrind implements a race detection algorithm called "Eraser", with further improvements to reduce the number of false reports. However, Helgrind is still at an experimental stage.

5. Massif
The heap profiler. It measures how much heap memory a program uses and reports the size of heap blocks, heap administration overhead, and stacks. Massif can help us reduce memory usage. On modern systems with virtual memory, this can also speed up our programs and reduce the chance that they end up in swap space.
In addition, there are Lackey and Nulgrind. Lackey is a small tool that is seldom used; Nulgrind only shows developers how to create a tool. We do not cover them here.

II. Using Valgrind
Valgrind is very simple to use. The format of the valgrind command is:
valgrind [valgrind-options] your-prog [your-prog-options]
Some commonly used options are:
Option                                   Role
-h, --help                               Display help information.
--version                                Display the version of the Valgrind core; each tool has its own version.
-q, --quiet                              Run quietly, printing only error messages.
-v, --verbose                            Print more detailed information.
--tool=<toolname> [default: memcheck]    The most commonly used option. Run the Valgrind tool named toolname. If it is omitted, Memcheck is run by default.
--db-attach=<yes|no> [default: no]       Attach to a debugger when an error occurs, to make debugging easier. (This option was removed in Valgrind 3.11.0 in favor of --vgdb.)
Let's look at a concrete example. We construct a C program that contains a memory leak, as follows:
#include <stdlib.h>
#include <stdio.h>

void f(void)
{
    int *x = malloc(10 * sizeof(int));
    x[10] = 0;              /* problem 1: heap block overrun */
}                           /* problem 2: memory leak -- x not freed */

int main(void)
{
    int i;
    f();
    printf("i=%d\n", i);    /* problem 3: use of uninitialised value */
    return 0;
}
Save it as memleak.c, compile it, and then check it with Valgrind:
$ gcc -Wall -o memleak memleak.c
$ valgrind --tool=memcheck ./memleak
We get the following error message:
==3649== Invalid write of size 4
==3649==    at 0x80483CF: f (in /home/wangcong/memleak)
==3649==    by 0x80483EC: main (in /home/wangcong/memleak)
==3649==  Address 0x4024050 is 0 bytes after a block of size 40 alloc'd
==3649==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==3649==    by 0x80483C5: f (in /home/wangcong/memleak)
==3649==    by 0x80483EC: main (in /home/wangcong/memleak)
The leading 3649 is the process ID of the program while it was running. The first line gives the type of error, in this case an invalid write. The lines below it are a stack trace showing where the error occurred: in f(), called from main(). The "Address" lines show that the write landed immediately after the end of the block that f() allocated with malloc().
==3649== Use of uninitialised value of size 4
==3649==    at 0xC3A264: _itoa_word (in /lib/libc-2.4.so)
==3649==    by 0xC3E25C: vfprintf (in /lib/libc-2.4.so)
==3649==    by 0xC442B6: printf (in /lib/libc-2.4.so)
==3649==    by 0x80483FF: main (in /home/wangcong/memleak)
This error is the use of an uninitialized value inside the printf() call made by main(). The call chain is reconstructed from the stack, so traces can sometimes be very long, especially when you use the C++ STL. Some of the other errors reported are also caused by passing the uninitialized value into libc functions. When the program finishes, Valgrind prints a short summary:
==3649== ERROR SUMMARY: 20 errors from 6 contexts (suppressed: 12 from 1)
==3649== malloc/free: in use at exit: 40 bytes in 1 blocks.
==3649== malloc/free: 1 allocs, 0 frees, 40 bytes allocated.
==3649== For counts of detected errors, rerun with: -v
==3649== searching for pointers to 1 not-freed blocks.
==3649== checked 47,256 bytes.
==3649==
==3649== LEAK SUMMARY:
==3649==    definitely lost: 40 bytes in 1 blocks.
==3649==      possibly lost: 0 bytes in 0 blocks.
==3649==    still reachable: 0 bytes in 0 blocks.
==3649==         suppressed: 0 bytes in 0 blocks.
==3649== Use --leak-check=full to see details of leaked memory.
We can clearly see how much memory was allocated and freed, and how much was leaked, which makes it very convenient to track down memory leaks. Next, we recompile the program with debugging information and attach the debugger:
$ gcc -Wall -ggdb -o memleak memleak.c
$ valgrind --db-attach=yes --tool=memcheck ./memleak
As soon as an error occurs, Valgrind offers to start the debugger (typically gdb):
==3893== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
starting debugger
==3893== starting debugger with cmd: /usr/bin/gdb -nw /proc/3895/fd/1014 3895
After exiting gdb, we return to Valgrind and the program continues executing.
Still using the memleak program, we now use Callgrind to profile it:
$ valgrind --tool=callgrind ./memleak
Callgrind prints a lot of output and finally generates a file in the current directory named callgrind.out.<pid>. Use callgrind_annotate to view it:
$ callgrind_annotate callgrind.out.3949
Detailed information is listed. In addition, while Callgrind is running your program, you can use callgrind_control to observe the program's execution without interfering with it.
Now let's look at Cachegrind:
$ valgrind --tool=cachegrind ./memleak
We get the following information:
==4073== I   refs:      147,500
==4073== I1  misses:      1,189
==4073== L2i misses:        679
==4073== I1  miss rate:    0.80%
==4073== L2i miss rate:    0.46%
==4073==
==4073== D   refs:       61,920  (46,126 rd + 15,794 wr)
==4073== D1  misses:      1,759  ( 1,545 rd +    214 wr)
==4073== L2d misses:      1,241  ( 1,062 rd +    179 wr)
==4073== D1  miss rate:     2.8% (   3.3%   +    1.3%  )
==4073== L2d miss rate:     2.0% (   2.3%   +    1.1%  )
==4073==
==4073== L2 refs:         2,948  ( 2,734 rd +    214 wr)
==4073== L2 misses:       1,920  ( 1,741 rd +    179 wr)
==4073== L2 miss rate:      0.9% (   0.8%   +    1.1%  )
The first block is the access information for the instruction caches (I1 and L2i): the total number of references, the number of misses, and the miss rates.
The middle block is the same information for the data caches (D1 and L2d), and the last block covers the L2 cache as a whole. Cachegrind also generates a file named cachegrind.out.<pid>, which can be read with cg_annotate to produce a more detailed listing. Massif is used in much the same way as Cachegrind, but it also generates a PostScript file named massif.<pid>.ps, which contains a single color graph describing heap usage.
The above is only a brief demonstration of how to use Valgrind. More information can be found in its accompanying documentation or on Valgrind's home page: http://www.valgrind.org. Learning to use Valgrind correctly and sensibly can be a great help when debugging programs.

Multi-threaded performance analysis using Valgrind's Callgrind tool
Original: HTTP://BLOGREAD.CN/IT/ARTICLE/7168?F=HOT1
Themes: Performance Analysis, Valgrind, Multithreading

Introduction

Valgrind is an open source performance analysis tool. According to its documentation, it can check for problems such as memory leaks and can also generate function call graphs. These two features alone make it attractive.

This article is mainly about how to use Valgrind's Callgrind tool for performance analysis.

Analysis process

Generating profiling data using the Callgrind tool

The command format is as follows:

$ valgrind --tool=callgrind ./exproxy

Here ./exproxy is the program we want to profile. When execution finishes, a file named "callgrind.out.<process ID>" is generated in the current directory, for example callgrind.out.31113. Note that when profiling a daemon process, you should not stop it with kill -9: SIGKILL cannot be caught, so Callgrind gets no chance to write its output file.

If you are profiling a multi-threaded program, you can also add the option --separate-threads=yes to the command line. This generates a separate profile file for each thread, as follows:

$ valgrind --tool=callgrind --separate-threads=yes ./exproxy

In addition to callgrind.out.31113, the output will then include one file per thread, named as follows:

callgrind.out.31113-01 callgrind.out.31113-02 callgrind.out.31113-03

Converting Callgrind profile data to dot format

The gprof2dot.py script converts the profile data generated by Callgrind into dot format, which makes it easy to visualize the data as a graph with dot.

The script has to be downloaded separately. It is used as follows:

$ python gprof2dot.py -f callgrind -n10 -s callgrind.out.31113 > valgrind.dot

Using dot to generate a picture from the data

The command format is as follows:

$ dot -Tpng valgrind.dot -o valgrind.png

[Figure: example of a generated picture]

From the graph, we can see at a glance where the program spends its time and understand the related call relationships.
