Analysis of slow query logs by pt-query-digest in MySQL, ptquerydigest
I. Introduction
Pt-query-digest is a tool used to analyze mysql slow queries. It can analyze binlog, General log, and slowlog, you can also use SHOWPROCESSLIST or MySQL protocol data captured by tcpdump for analysis. You can output the analysis results to a file. The analysis process is to first parameterize the query statement conditions, and then perform grouping statistics on the query after parameterization, calculate the execution time, number of times, and proportion of each query. You can use the analysis results to find out the problem and optimize it.
Ii. Install pt-query-digest
1. Download Page: https://www.percona.com/doc/percona-toolkit/2.2/installation.html
2. perl Module
yum install -y perl-CPAN perl-Time-HiRes
3. Installation Steps
Method 1: rpm Installation
cd /usr/local/srcwget percona.com/get/percona-toolkit.rpmyum install -y percona-toolkit.rpm
The tool installation directory is:/usr/bin.
Method 2: Install with source code
cd /usr/local/srcwget percona.com/get/percona-toolkit.tar.gztar zxf percona-toolkit.tar.gzcd percona-toolkit-2.2.19perl Makefile.PL PREFIX=/usr/local/percona-toolkitmake && make install
The tool installation directory is/usr/local/percona-toolkit/bin.
4. Introduction to the usage of each tool (details: https://www.percona.com/doc/percona-toolkit/2.2/index.html)
(1) Slow query log analysis statistics
pt-query-digest /usr/local/mysql/data/slow.log
(2) server Summary
pt-summary
(3) server disk monitoring
pt-diskstats
(4) mysql Service Status Summary
pt-mysql-summary -- --user=root --password=root
Iii. pt-query-digest syntax and important options
- Pt-query-digest [OPTIONS] [FILES] [DSN]
- -- Create-review-table when the -- review parameter is used to output the analysis result to the table, it is automatically created if no table exists.
- -- Create-history-table when the -- history parameter is used to output the analysis result to the table, it is automatically created if no table exists.
- -- Filter: matches and filters input slow queries based on specified strings before analysis.
- -- Limit limits the percentage or quantity of output results. The default value is 20, that is, the output of the slowest 20 statements. If the value is 50%, the results are sorted from large to small according to the total response time proportion, output to the end of the total reaches 50%.
- -- Host mysql server address
- -- User mysql user name
- -- Password mysql user password
- -- History saves the analysis results to the table, and the analysis results are more detailed. If the same statement exists next time -- history is used, and the time range of the query is different from that in the history table, it is recorded in the data table. You can query the same CHECKSUM to compare the historical changes of a certain type of query.
- -- Review saves the analysis results to the table. This analysis only parameterizes the query conditions and queries a record of a type, which is relatively simple. When -- review is used next time, if the same statement analysis exists, it is not recorded in the data table.
- -- Output analysis result output type. The values can be report (Standard Analysis report), slowlog (Mysql slow log), json, and json-anon. Generally, report is used for ease of reading.
- -- Start time of since analysis. The value is a string, which can be a specified time point in the format of "yyyy-mm-dd [hh: mm: ss, it can also be a simple Time Value: s (seconds), h (hours), m (minutes), d (days). For example, 12 hours indicates that statistics are started from 12 hours ago.
- -- Until end time, combined with-since, can analyze slow queries within a period of time.
Iv. Analyze pt-query-digest output results
Part 1: Overall statistics
Overall: Total number of queries
Time range: query execution Time range
Unique: the number of unique queries, that is, the total number of Different queries after the query conditions are parameterized
Total: total min: min max: max avg: Average
95%: Sort all values from small to large in the number at the position of 95%. This number is generally the most reference value.
Median: median, which sorts all values from small to large and positions them in the middle.
# User time, system time, physical memory usage, and virtual memory usage of the tool for log analysis #340 ms user time, 140 ms system time, 23.99 M rss, 203.11 M vsz # tool execution time # Current date: Fri Nov 25 02:37:18 2016 # Host Name of the running analysis tool # Hostname: localhost. localdomain # analyzed file name # Files: slow. log # total number of statements, unique statement quantity, QPS, concurrency # Overall: 2 total, 2 unique, 0.01 QPS, 0.01x concurrency ____________ # Time range of log records # Time range: 06:06:18 to 06:11:40 # minimum maximum average Attribute 95% standard medium # Attribute total min max avg 95% stddev median #============== ========================================================== ====# statement execution time # Exec time 3 s 640 ms 2 s 1 s 2 s 999 ms 1 s # Lock occupation time # Lock time 1 ms 0 1 ms 723us 1 ms 1 ms 723us # number of lines sent to the client # Rows sent 5 1 4 2.50 4 2.12 2.50 # number of lines scanned by the select statement # Rows examine 186.17 k 0 186.17 k 93.09 k 186.17 k 131.64 k # number of characters to Query # Query size 455 15 440 227.50 440 300.52 227.50
Part 2: Query group statistics
Rank: Rank of all statements. by default, it is sorted in descending order of query time. It is specified by -- order-.
Query ID: Statement ID (remove unnecessary spaces and text characters to calculate the hash value)
Response: total Response time
Time: Total time proportion of the query in this analysis
CILS: number of executions, that is, the total number of queries of this type in this analysis
R/Call: average response time for each execution
V/M: Ratio of the response time Variance-to-mean
Item: query object
# Profile# Rank Query ID Response time Calls R/Call V/M Item# ==== ================== ============= ===== ====== ===== ===============# 1 0xF9A57DD5A41825CA 2.0529 76.2% 1 2.0529 0.00 SELECT# 2 0x4194D8F83F4F9365 0.6401 23.8% 1 0.6401 0.00 SELECT wx_member_base
Part 3: detailed statistics of each query
According to the detailed statistics of the following queries, the top table lists the statistics of execution times, maximum, minimum, average, and 95%.
ID: ID of the Query, corresponding to the Query ID
Databases: Database Name
Users: number of executions by each user (proportion)
Query_time distribution: Query time distribution, which indicates the ratio of the interval. In this example, the number of queries between 1 s and 10 s is twice that of 10 s.
Tables: Tables involved in the query
Explain: SQL statement
# Query 1: 0 QPS, 0x concurrency, ID 0xF9A57DD5A41825CA at byte 802 ______# This item is included in the report because it matches --limit.# Scores: V/M = 0.00# Time range: all events occurred at 2016-11-22 06:11:40# Attribute pct total min max avg 95% stddev median# ============ === ======= ======= ======= ======= ======= ======= =======# Count 50 1# Exec time 76 2s 2s 2s 2s 2s 0 2s# Lock time 0 0 0 0 0 0 0 0# Rows sent 20 1 1 1 1 1 0 1# Rows examine 0 0 0 0 0 0 0 0# Query size 3 15 15 15 15 15 0 15# String:# Databases test# Hosts 192.168.8.1# Users mysql# Query_time distribution# 1us# 10us# 100us# 1ms# 10ms# 100ms# 1s ################################################################# 10s+# EXPLAIN /*!50100 PARTITIONS*/select sleep(2)\G
V. Usage example
1. Directly analyze the slow Query file:
pt-query-digest slow.log > slow_report.log
2. Analyze the queries in the last 12 hours:
pt-query-digest --since=12h slow.log > slow_report2.log
3. Analyze queries within a specified time range:
pt-query-digest slow.log --since '2017-01-07 09:30:00' --until '2017-01-07 10:00:00'> > slow_report3.log
4. analysis refers to slow queries containing select statements
pt-query-digest --filter '$event->{fingerprint} =~ m/^select/i' slow.log> slow_report4.log
5. Slow query for a user
pt-query-digest --filter '($event->{user} || "") =~ m/^root/i' slow.log> slow_report5.log
6. query all slow queries for full table scans or full join
pt-query-digest --filter '(($event->{Full_scan} || "") eq "yes") ||(($event->{Full_join} || "") eq "yes")' slow.log> slow_report6.log
7. Save the query to the query_review table
pt-query-digest --user=root –password=abc123 --review h=localhost,D=test,t=query_review--create-review-table slow.log
8. Save the query to the query_history table.
pt-query-digest --user=root –password=abc123 --review h=localhost,D=test,t=query_history--create-review-table slow.log_0001pt-query-digest --user=root –password=abc123 --review h=localhost,D=test,t=query_history--create-review-table slow.log_0002
9. Capture mysql tcp protocol data through tcpdump, and then analyze
tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txtpt-query-digest --type tcpdump mysql.tcp.txt> slow_report9.log
10. Analyze binlog
mysqlbinlog mysql-bin.000093 > mysql-bin000093.sqlpt-query-digest --type=binlog mysql-bin000093.sql > slow_report10.log
11. analyze the general log
pt-query-digest --type=genlog localhost.log > slow_report11.log
Summary
The above is all the content of this article. I hope the content of this article will help you in your study or work. If you have any questions, please leave a message, thank you for your support.