This article describes a method for locating MySQL performance bottlenecks. It is shared here for reference; the details are as follows:
Introduction
Using a real production case, this article walks through the whole process of locating a performance bottleneck.
Troubleshooting Procedures
We received an alert that the load on a MySQL instance serving a production back end was relatively high, so we logged in to the server to check.
1. First, check at the OS level
After logging in to the server, the first goal is to determine which processes are causing the high load, where those processes are stuck, and what the bottleneck is.
Typically, the disk I/O subsystem is the most likely bottleneck on a server because it usually has the slowest read and write speed. Even with today's PCIe SSDs, random I/O is still not as fast as memory. Of course, there are many possible reasons why disk I/O is slow, and you need to identify which one applies.
As a first step, we usually look at the overall load; if the load is high, all processes will certainly run slowly.
You can run the w or sar -q command to view the load, for example:
[yejr@imysql.com:~]# w
11:52:58 up 702 days, min, 1 user, load average: 7.20, 6.70, 6.47
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
root pts/0 1.xx.xx.xx 11:51 0.00s 0.03s 0.00s w
Or the output of sar -q:
[yejr@imysql.com:~]# sar -q 1
Linux 2.6.32-431.el6.x86_64 yejr.imysql.com 01/13/2016 (24 CPU)
02:51:18 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
02:51:19 PM 4 2305 6.41 6.98 7.12 3
02:51:20 PM 2 2301 6.41 6.98 7.12 4
02:51:21 PM 0 2300 6.41 6.98 7.12 5
02:51:22 PM 6 2301 6.41 6.98 7.12 8
02:51:23 PM 2 2290 6.41 6.98 7.12 8
The load average roughly reflects how many tasks are queued waiting for the CPU; the more tasks are waiting, the higher the load. For a server running a database, a load value above 5 is generally considered relatively high.
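Because the load average counts tasks across all CPUs, it can help to relate it to the number of cores. The following is a minimal sketch, not part of the original procedure: the helper name load_per_core is hypothetical, and it simply divides a load value by a core count.

```shell
# Hypothetical helper: divide a load average by the number of CPU cores.
# Usage: load_per_core <load> <cores>
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example on a live Linux host (commented out so the snippet has no side effects):
# load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"
```

With the values from the transcripts (a 1-minute load of 7.20 on 24 CPUs), load_per_core 7.20 24 prints 0.30. The threshold of 5 mentioned in the text is the author's rule of thumb for the absolute load value, not for this per-core figure.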
A high load can have several causes:
Some processes or services consume a lot of CPU (the service is handling more requests, or the application has a bottleneck);
Heavy swapping occurs (not enough physical memory is available);
A large number of interrupts occurs (for example, caused by an SSD or the network);
Disk I/O is slow (the CPU keeps waiting for disk I/O requests);
We can then execute the following command to determine which subsystem the bottleneck is in:
[yejr@imysql.com:~]# top
top - 11:53:04 up 702 days, min, 1 user, load average: 7.18, 6.70, 6.47
Tasks: 576 total, 1 running, 575 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.7%us, 3.4%sy, 0.0%ni, 77.6%id, 11.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 49374024k total, 32018844k used, 17355180k free, 115416k buffers
Swap: 16777208k total, 117612k used, 16659596k free, 5689020k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14165 mysql 0 8822m 3.1g 4672 S 162.3 6.6 89839:59 mysqld
40610 mysql 0 25.6g 14g 8336 S 121.7 31.5 282809:08 mysqld
49023 mysql 0 16.9g 5.1g 4772 S 4.6 10.8 34940:09 mysqld
Clearly, the first two mysqld processes are responsible for the high overall load.
Also, the Cpu(s) line shows that the %us and %wa values are relatively high, indicating that the main bottlenecks are probably CPU consumed by user processes and disk I/O wait.
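To watch a single value such as %wa over time, the Cpu(s) line can be parsed with awk. This is a minimal sketch that assumes the older top format shown in the transcript (value%name pairs separated by commas); newer top versions print the line differently. The helper name cpu_pct is hypothetical.

```shell
# Hypothetical helper: print one field (e.g. wa or us) from a "Cpu(s):" line
# in the older top format, read from standard input.
# Usage: top -b -n 1 | cpu_pct wa
cpu_pct() {
    awk -v key="$1" '
        /^Cpu\(s\):/ {
            sub(/^Cpu\(s\):/, "")          # drop the line label
            n = split($0, parts, ",")      # one "value%name" item per part
            for (i = 1; i <= n; i++) {
                gsub(/ /, "", parts[i])    # strip padding
                split(parts[i], kv, "%")   # kv[1] = value, kv[2] = name
                if (kv[2] == key) print kv[1]
            }
        }'
}
```

Fed the Cpu(s) line from the top output in this article, cpu_pct wa prints 11.0 and cpu_pct us prints 7.7.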
Let's analyze the disk I/O situation first.
Run sar -d to confirm that disk I/O is really heavy (the last column, %util, stays at about 100%):
[yejr@imysql.com:~]# sar -d 1
Linux 2.6.32-431.el6.x86_64 yejr.imysql.com 01/13/2016 (24 CPU)
11:54:32 AM dev8-0 5338.00 162784.00 1394.00 30.76 5.24 0.98 0.19 100.00
11:54:33 AM dev8-0 5134.00 148032.00 32365.00 35.14 6.93 1.34 0.19 100.10
11:54:34 AM dev8-0 5233.00 161376.00 996.00 31.03 9.77 1.88 0.19 100.00
11:54:35 AM dev8-0 4566.00 139232.00 1166.00 30.75 5.37 1.18 0.22 100.00
11:54:36 AM dev8-0 4665.00 145920.00 630.00 31.41 5.94 1.27 0.21 100.00
11:54:37 AM dev8-0 4994.00 156544.00 546.00 31.46 7.07 1.42 0.20 100.00
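When a server has many block devices, it can be easier to filter the sar -d samples than to read them by eye. A minimal sketch, assuming the default devM-N device naming and that %util is the last column, as in the output above; the helper name busy_devices and the default threshold of 90 are illustrative choices, not part of the original procedure.

```shell
# Hypothetical helper: print the device and %util (last column) of every
# sar -d sample whose %util is at or above a threshold (default 90).
# Usage: sar -d 1 5 | busy_devices 90
busy_devices() {
    awk -v limit="${1:-90}" '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^dev[0-9]+-[0-9]+$/ && $NF + 0 >= limit + 0)
                    print $i, $NF
        }'
}
```

The trailing "Average:" lines that sar prints when it exits also contain a device name, so they are reported as well if they exceed the threshold.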
Use iotop to determine exactly which processes consume the most disk I/O resources:
[yejr@imysql.com:~]# iotop
Total DISK READ: 60.38 M/s | Total DISK WRITE: 640.34 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
16397 be/4 mysql 8.92 M/s 0.00 B/s 0.00 % 94.77 % mysqld --basedir=/usr/local/m~og_3320/mysql.sock --port=3320
7295 be/4 mysql 10.98 M/s 0.00 B/s 0.00 % 93.59 % mysqld --basedir=/usr/local/m~og_3320/mysql.sock --port=3320
14295 be/4 mysql 10.50 M/s 0.00 B/s 0.00 % 93.57 % mysqld --basedir=/usr/local/m~og_3320/mysql.sock
14288 be/4 mysql 14.30 M/s 0.00 B/s 0.00 % 91.86 % mysqld --basedir=/usr/local/m~og_3320/mysql.sock --port=3320
14292 be/4 mysql 14.37 M/s 0.00 B/s 0.00 % 91.23 % mysqld --basedir=/usr/local/m~og_3320/mysql.sock --port=3320
As you can see, the instance on port 3320 consumes most of the disk I/O, so next we look at which queries are running in this instance.
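To tie a busy mysqld process back to its instance, the port can be read from its command line. A minimal sketch, assuming the instance was started with an explicit --port= option as in the iotop output; the helper name mysqld_port is hypothetical.

```shell
# Hypothetical helper: print the value of --port= from each command line
# read on standard input; lines without the option print nothing.
# Usage: ps -o args= -p <pid> | mysqld_port
mysqld_port() {
    sed -n 's/.*--port=\([0-9][0-9]*\).*/\1/p'
}
```

For example, ps -o args= -p 14165 | mysqld_port would show the port of the first mysqld process listed by top, provided that process was started with --port=.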
2. Check at the MySQL level
First look at what queries are currently running:
[yejr@imysql.com (db)]> mysqladmin pr|grep -v Sleep
+----+------+-----------+----+---------+------+--------------+-----------------------------------------------------------------------------------------------+
| Id | User | Host      | db | Command | Time | State        | Info                                                                                          |
+----+------+-----------+----+---------+------+--------------+-----------------------------------------------------------------------------------------------+
| 25 | x    | 10.x:8519 | db | Query   | 68   | Sending data | select max(fvideoid) from (select fvideoid from T where fvideoid>404612 order by fvideoid) T1 |
| 26 | x    | 10.x:8520 | db | Query   | 65   | Sending data | select max(fvideoid) from (select fvideoid from T where fvideoid>484915 order by fvideoid) T1 |
| 28 | x    | 10.x:8522 | db | Query   | 130  | Sending data | select max(fvideoid) from (select fvideoid from T where fvideoid>404641 order by fvideoid) T1 |
| 27 | x    | 10.x:8521 | db | Query   | 167  | Sending data | select max(fvideoid) from (select fvideoid from T where fvideoid>324157 order by fvideoid) T1 |
| 36 | x    | 10.x:8727 | db | Query   | 174  | Sending data | select max(fvideoid) from (select fvideoid from T where fvideoid>324346 order by fvideoid) T1 |
+----+------+-----------+----+---------+------+--------------+-----------------------------------------------------------------------------------------------+
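On a busy instance the process list can be long, so it helps to keep only statements that have been running for a while. A minimal sketch, assuming the standard mysqladmin processlist table layout (Id, User, Host, db, Command, Time, State, Info) and statements that do not themselves contain a | character; the helper name long_queries and the default of 60 seconds are illustrative.

```shell
# Hypothetical helper: from `mysqladmin processlist` table output on stdin,
# print Id, Time and Info of statements running longer than N seconds.
# Usage: mysqladmin processlist | long_queries 60
long_queries() {
    awk -F '|' -v min="${1:-60}" '
        NF >= 9 && $6 ~ /Query/ && $7 + 0 > min + 0 {
            id = $2; t = $7; info = $9
            gsub(/^ +| +$/, "", id)      # trim the column padding
            gsub(/^ +| +$/, "", t)
            gsub(/^ +| +$/, "", info)
            print id, t, info
        }'
}
```

Border lines and the header row are skipped because they do not have "Query" in the Command column.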
You can see that many slow queries are still running, and the slow query log shows that this type of SQL statement occurs very frequently.
This is a very inefficient way to write the query: it has to scan the whole primary key even though only a single maximum value is needed. The slow query log shows:
Rows_sent: 1 Rows_examined: 5502460
Each execution scans more than 5 million rows just to read one maximum value, which is very inefficient.
After analysis, a simple rewrite lets this SQL finish in single-digit milliseconds, whereas the original needed 150-180 seconds, a speedup of several orders of magnitude.
The rewrite is simple: sort the results in descending order and take the first record. The original approach sorted the results in ascending order and then took the last record, which is rather embarrassing...
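The article does not show the rewritten statement, so the sketch below is one reading of the description (descending sort plus LIMIT 1), not necessarily the exact SQL the author deployed. The function names are hypothetical; the table name T, alias T1 and column fvideoid are taken from the process list.

```shell
# Hypothetical helper: succeed only if the argument is a non-empty digit string,
# so that nothing but a number is ever substituted into the SQL text.
is_number() {
    case "$1" in
        ''|*[!0-9]*) echo "expected a numeric id" >&2; return 1 ;;
    esac
}

# The slow form seen in the process list: sort everything, then take the maximum.
original_sql() {
    is_number "$1" || return 1
    printf 'SELECT MAX(fvideoid) FROM (SELECT fvideoid FROM T WHERE fvideoid > %s ORDER BY fvideoid) T1;\n' "$1"
}

# Assumed rewrite: walk the index backwards and stop at the first row.
rewritten_sql() {
    is_number "$1" || return 1
    printf 'SELECT fvideoid FROM T WHERE fvideoid > %s ORDER BY fvideoid DESC LIMIT 1;\n' "$1"
}

# Example (commented out; "db" is the schema name shown in the process list):
# rewritten_sql 404612 | mysql db
```

One behavioral difference to keep in mind: when no row matches, MAX() still returns one row containing NULL, while the ORDER BY ... LIMIT 1 form returns an empty result set, so the calling code may need a small adjustment.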
Summary
In this example, the cause of the bottleneck was fairly easy to locate and the SQL optimization was not difficult. In real production environments, the following are common causes of high load:
A single request reads or writes too much data, causing heavy disk I/O, for example one SQL statement that reads or updates tens of thousands of rows or more. It is best to find a way to reduce the amount of data read or written at a time;
The SQL query has no suitable index for filtering, sorting (ORDER BY), grouping (GROUP BY), or aggregation (MIN/MAX/COUNT/AVG, etc.). Add an index or rewrite the SQL;
A sudden burst of requests arrives. This is generally fine as long as the server can ride out the peak, but to be safe it is better to give the server some headroom, because failing to withstand the peak may trigger an avalanche effect;
Scheduled tasks such as statistical analysis and backups increase the load. They consume a lot of CPU, memory, and disk I/O, and are best run on a dedicated slave server;
The server's power-saving policy lowers the CPU frequency when the load is low and raises it automatically when the load is high, but usually not quickly enough, so CPU performance is insufficient to absorb a sudden burst of requests;
RAID cards are usually equipped with a BBU (a backup battery for the cache module). Earlier models generally used lithium batteries, which need periodic charge and discharge cycles (every 90 days on Dell servers, every 30 days on IBM). We can monitor when the next cycle is due and trigger the discharge ahead of time during off-peak hours. Most newer servers use capacitor-based modules and do not have this problem;
The file system is ext4 or even ext3 rather than XFS. Under high I/O pressure, %util may already be at 100% while IOPS cannot go any higher; switching to XFS usually improves this significantly;
The kernel's I/O scheduler is CFQ rather than deadline or noop. This can be changed online and can bring a significant improvement.
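For the last item, the active scheduler is the name shown in square brackets in the sysfs scheduler file. A minimal sketch for reading it; the helper name active_scheduler and the device name sda are assumptions for illustration, and on newer multi-queue kernels the available names differ (for example mq-deadline and none).

```shell
# Hypothetical helper: print the active I/O scheduler, i.e. the bracketed name
# in the contents of /sys/block/<device>/queue/scheduler read from stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Examples (commented out; "sda" is an assumed device name):
# active_scheduler < /sys/block/sda/queue/scheduler
# echo deadline > /sys/block/sda/queue/scheduler   # as root; lost after a reboot
```

A change made through sysfs takes effect immediately but does not survive a reboot, so it also needs to be made persistent through the boot configuration if it proves helpful.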
I hope this article helps you in your work with MySQL databases.