Ruby Profiler detailed ruby-prof (I)

Source: Internet
Author: User



Project Address: Ruby-prof



In the previous Profiling tool in Ruby, we listed several of the most commonly used profilers, but just a brief introduction, this time detailing how ruby-prof is used.



Ruby-prof is a powerful, support CPU, memory use, object allocation and so on performance analysis, and provides a lot of friendly output format, not only have text-based, HTML format, but also output Graphviz format dot file, and applicable with Kcachegrindcall treeformat, in fact, this format is based on Valgrind, this tool is very good, we can go to the official website to understand.



There are two ways to runruby-prof, one is theruby-profstart and stop code that needs to be inserted in the source:


require ‘ruby-prof’

RubyProf.start
# Write the code for performance analysis here result = RubyProf.stop

# Select a Printer printer = RubyProf :: FlatPrinter.new (result) printer.print (STDOUT)


Another is to run directly on the command line, after the Gem package is installed, theruby-profcommand is also installedruby-prof, using the following:


ruby-prof -p flat test.rb


This method is more flexible and we use this method to illustrateruby-profthe use of the method.



Directly runruby-prof -hgetruby-profhelp information, because too much, here is not listed, you can do their own in the system to see.



Where the-pparameters are output format, the following will describe the format of each Printer, the significance of the exponent and the use of related display tools. The use of several other parameters is also described in the process of introducing the output format.


Output format type
flat                   - Prints a flat profile as text (default).
flat_with_line_numbers - same as flat, with line numbers.
graph                  - Prints a graph profile as text.
graph_html             - Prints a graph profile as html.
call_tree              - format for KCacheGrind
call_stack             - prints a HTML visualization of the call tree
dot                    - Prints a graph profile as a dot file
multi                  - Creates several reports in output directory
Sample Program
def m1
  "string" * 1000
end

def m2
  "string" * 10000
end

def start
  n = 0
  n = n + 1 while n < 100_000

  10000.times do
    m1
    m2
  end
end

start


This is the most basic test program, and we'llruby-profadd additional code to the presentation as we introduce the functionality.


The impact of GC on profiling


Performance analysis of the time the GC will always have a large impact on the results, here we do not consider it, we will have another article to do a special introduction.


The simplest output format-flat
ruby-prof -p flat test.rb

Measure Mode: wall_time
Thread ID: 12161840
Fiber ID: 19223800
Total: 0.206998
Sort by: self_time

 %self      total      self      wait     child     calls  name
 68.50      0.142     0.142     0.000     0.000    20000   String#*
 10.45      0.207     0.022     0.000     0.185        1   Object#start
  6.82      0.014     0.014     0.000     0.000   100001   Fixnum#<
  6.46      0.013     0.013     0.000     0.000   100000   Fixnum#+
  2.84      0.158     0.006     0.000     0.152        1   Integer#times
  2.52      0.128     0.005     0.000     0.123    10000   Object#m2
  2.40      0.024     0.005     0.000     0.019    10000   Object#m1
  0.01      0.207     0.000     0.000     0.207        2   Global#[No method]
  0.01      0.000     0.000     0.000     0.000        2   IO#set_encoding
  0.00      0.000     0.000     0.000     0.000        3   Module#method_added

* indicates recursively called methods


First, one by one. Explain the meaning of the indicators:


Indicator explanation
%self The time taken by the method itself, excluding the other method execution time of the call
Total Total time of the method execution, including the execution time of the other method being called
Self The time that the method itself executes, not including the other method execution time of the call
Wait Multi-threaded, waiting for other threads in the time, in a single-threaded program, always 0
Child Total time of other methods called by the method
Calls Number of calls to the method


The basic relationship between them is:


total = self + wait + child


Specifically,String#*This method occupies 68.5% of the program's running time, takes 0.142 seconds, executes 20,000 times, and theObject#startmethod is defined in the codestart, which occupies 10.45% of the program run time, takes 0.022 seconds, calls the The method took 0.185 seconds, called 1 times, the total time spent is 0.022 + 0.185 = 0.207, I believe we can now name the meaning of the index white.



Now that we understand the indicator meaning of this output, if this program is a performance problem, then what is the problem with the data? Normally, we need to look at two indicators,%self and calls, simply look at%self sometimes useless, the above example, its time-consuming method isString#*that we are not likely to improve the language itself, in this case, we found that the value of calls is larger, Then try to reduce theString#*call to the method.



Using the flat output format, you can only find such a simple problem, if you want to reduceString#*the method call, you need to know who called it, and this output format is not shown, you need to choose other output format.


Simple call relationship output-graph
ruby-prof -p graph test.rb

Measure Mode: wall_time
Thread ID: 17371960
Fiber ID: 24397420
Total Time: 0.21026015281677246
Sort by: total_time

  %total   %self      total       self       wait      child            calls    Name
--------------------------------------------------------------------------------
  99.99%   0.01%      0.210      0.000      0.000      0.210                2      Global#[No method]
                      0.210      0.022      0.000      0.188              1/1      Object#start
                      0.000      0.000      0.000      0.000              3/3      Module#method_added
--------------------------------------------------------------------------------
                      0.210      0.022      0.000      0.188              1/1      Global#[No method]
  99.98%  10.34%      0.210      0.022      0.000      0.188                1      Object#start
                      0.161      0.006      0.000      0.155              1/1      Integer#times
                      0.014      0.014      0.000      0.000    100001/100001      Fixnum#<
                      0.014      0.014      0.000      0.000    100000/100000      Fixnum#+
--------------------------------------------------------------------------------
                      0.161      0.006      0.000      0.155              1/1      Object#start
  76.48%   2.68%      0.161      0.006      0.000      0.155                1      Integer#times
                      0.130      0.005      0.000      0.125      10000/10000      Object#m2
                      0.025      0.005      0.000      0.020      10000/10000      Object#m1
--------------------------------------------------------------------------------
                      0.020      0.020      0.000      0.000      10000/20000      Object#m1
                      0.125      0.125      0.000      0.000      10000/20000      Object#m2
  69.23%  69.23%      0.146      0.146      0.000      0.000            20000      String#*
--------------------------------------------------------------------------------
                      0.130      0.005      0.000      0.125      10000/10000      Integer#times
  61.81%   2.28%      0.130      0.005      0.000      0.125            10000      Object#m2
                      0.125      0.125      0.000      0.000      10000/20000      String#*
--------------------------------------------------------------------------------
                      0.025      0.005      0.000      0.020      10000/10000      Integer#times
  11.99%   2.28%      0.025      0.005      0.000      0.020            10000      Object#m1
                      0.020      0.020      0.000      0.000      10000/20000      String#*
--------------------------------------------------------------------------------
                      0.014      0.014      0.000      0.000    100001/100001      Object#start
   6.73%   6.73%      0.014      0.014      0.000      0.000           100001      Fixnum#<
--------------------------------------------------------------------------------
                      0.014      0.014      0.000      0.000    100000/100000      Object#start
   6.42%   6.42%      0.014      0.014      0.000      0.000           100000      Fixnum#+
--------------------------------------------------------------------------------
   0.01%   0.01%      0.000      0.000      0.000      0.000                2      IO#set_encoding
--------------------------------------------------------------------------------
                      0.000      0.000      0.000      0.000              3/3      Global#[No method]
   0.00%   0.00%      0.000      0.000      0.000      0.000                3      Module#method_added

* indicates recursively called methods


The content of this output is more abundant, but it may also make the head a bit dizzy. Let's take a slow look at the analysis.



First of all, the sorting method is different, is sorted according to Total_time, flat output format is sorted by Self_time. The entire report is divided into several points, each of which describes the number of method invocation information, but note that the leftmost two columns,%total,%self the two columns are not empty,



First look at the second part:


--------------------------------------------------------------------------------
                      0.210      0.022      0.000      0.188              1/1      Global#[No method]
  99.98%  10.34%      0.210      0.022      0.000      0.188                1      Object#start
                      0.161      0.006      0.000      0.155              1/1      Integer#times
                      0.014      0.014      0.000      0.000    100001/100001      Fixnum#<
                      0.014      0.014      0.000      0.000    100000/100000      Fixnum#+
--------------------------------------------------------------------------------


Object#startThe execution of the method took 99.98% of the total time, not including the child method calls, took 10.34% of the time, was called once, andstartalso calledInteger#times,Fixnum#<andFixnum#+three methods in the method.



Then the second right number (calls), is a delimited number of/two, the number on the left is how many times this method is called at this levelObject#start, the number on the right is theObject#starttotal number of runs during this program. TheObject#startthree methods that are calledcallsare listed in theObject#startnumber of executions in, and the total number of executions.



In the first part there are two methods: theGlobal#[No method]representative does not have a caller, it can be understood that Ruby is preparing the execution environment,Module#method_addedwhen there is an instance method added, this method will be triggered.



So what can this output format explain? In the flat output format we have positionedString#*too many calls to the problem, then according to the output format of this graph format we should be able to find out who caused the problem.



First cut out the parts that can find the problem:


--------------------------------------------------------------------------------
                      0.020      0.020      0.000      0.000      10000/20000      Object#m1
                      0.125      0.125      0.000      0.000      10000/20000      Object#m2
  69.23%  69.23%      0.146      0.146      0.000      0.000            20000      String#*
--------------------------------------------------------------------------------
                      0.130      0.005      0.000      0.125      10000/10000      Integer#times
  61.81%   2.28%      0.130      0.005      0.000      0.125            10000      Object#m2
                      0.125      0.125      0.000      0.000      10000/20000      String#*
--------------------------------------------------------------------------------
                      0.025      0.005      0.000      0.020      10000/10000      Integer#times
  11.99%   2.28%      0.025      0.005      0.000      0.020            10000      Object#m1
                      0.020      0.020      0.000      0.000      10000/20000      String#*
--------------------------------------------------------------------------------


The first part shows thatString#*inObject#m1andObject#m1in each was called 10,000 times, a total of 20,000 times, the same number of times, then look at the following, also 10,000 times, in theObject#m2time spent is 0.125 seconds, and inObject#m1The time spent is 0.020 seconds, more than 0.105 seconds, so that we can locate the problem isObject#m2here.



Graph can be output to HTML format, here just demonstrates the plain text version, HTML format more easy to interact, need to add parameter-F to specify the path and file name of the output.


GraphViz Dot-dot
ruby-prof -p dot test.rb -f dot.dot


There are tools to convert dot files to PDF viewing, as well as tools that specialize in viewing dot files, such as XDot on Ubuntu.






This picture also clearly shows that the problem isObject#m2here.


Interactive Call relationships-Call_stack
ruby-prof -p call_stack test.rb -f callstack.html





Here is really a picture win thousand words, at a glance,Object#m2in theString#*10,000 calls spent 60.52% of the time, do not have to explain, hurry up and try it yourself.


Ultimate Universal Full View-Call_tree


First installedKCacheGrind, Ubuntu under directsudo apt-get install kcachegrind


ruby-prof -p call_tree test.rb -f call_tree.out


OpenKCacheGrind, then opencall_tree.out(file type Select all), this magical tool can present to you all the truth.


















With the output format described above, it is easy to understand this, we will introduce, but in another article, because this is a bit too long, the next article will detail the use of kcachegrind.



This article is compiled and collated by ONEAPM engineers. ONEAPM is the emerging leader in China's basic software industry, helping enterprise users and developers easily implement slow program code and real-time crawling of SQL statements. To read more technical articles, please visit the ONEAPM Official technology blog.



Ruby Profiler detailed ruby-prof (I)


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.