High IO usage in MySQL

Source: Internet
Author: User


1. Performance impact of logging:

The direct performance cost of logging is IO, the most expensive resource in a database system. MySQL's logs include the error log, the update log, the binary log (binlog), the general query log, and the slow query log. The update log exists only in old versions of MySQL and has since been replaced by the binary log.

By default, the system enables only the error log and disables all other logs, to minimize IO consumption and improve performance. (This describes the older MySQL versions this article covers; MySQL 8.0 enables the binary log by default.) In any application scenario of some importance, however, you must at least enable the binary log, because it is the basis on which many MySQL storage engines perform incremental backups and is also a prerequisite for MySQL replication. To tune performance further you also need to locate slow SQL statements, so many systems enable the slow query log as well, which records SQL statements whose execution time exceeds a threshold that you set.

The general query log is rarely enabled on production systems. Once it is enabled, every query executed in MySQL is written to the log, which places a heavy IO burden on the system for little practical benefit. It is normally enabled only briefly, in development and test environments, to find out which SQL statements a particular feature executes. Therefore, leaving aside the logs of the individual storage engines, the MySQL log that affects performance most is the binlog.
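
To see which of these logs are active on a given server, the log-related system variables can be inspected. The following is a minimal sketch, assuming MySQL 5.5 or later, where general_log, slow_query_log and long_query_time can be changed at runtime; the 1-second threshold is only an illustrative value, not a recommendation from the original article.

```sql
-- Show which logs are currently enabled.
SHOW GLOBAL VARIABLES LIKE 'log_bin';          -- binary log
SHOW GLOBAL VARIABLES LIKE 'general_log';      -- general query log
SHOW GLOBAL VARIABLES LIKE 'slow_query_log';   -- slow query log
SHOW GLOBAL VARIABLES LIKE 'long_query_time';  -- slow-query threshold, in seconds

-- Enable the slow query log at runtime (1 second is an illustrative threshold).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Keep the general query log off on production systems.
SET GLOBAL general_log = 'OFF';
```

Note that log_bin is read-only at runtime, so the binary log itself has to be enabled at server startup, and a new global long_query_time applies only to connections opened after the change.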

2. To reduce log IO, execute the following commands in mysql:

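SET GLOBAL sync_binlog = 500;

After every 500 transaction commits, MySQL calls a disk synchronization function such as fsync to force the binary log data written from binlog_cache onto the disk. The trade-off is that an operating system crash or power failure can lose the binlog entries of up to 500 commits.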

SET GLOBAL innodb_flush_log_at_trx_commit = 2;

The default value, 1, means the log is written and flushed to disk at every transaction commit (or for every statement outside a transaction). This is expensive, especially when no battery-backed cache is available. A value of 2 means the log is written only to the operating system cache at commit and is not flushed to disk immediately. The log is still flushed to disk about once per second, so you normally lose no more than 1-2 seconds of updates. A value of 0 is faster still but less safe: transactions can be lost even if only the MySQL server process crashes. With a value of 2, data can be lost only if the entire operating system goes down.

Note: settings made with SET GLOBAL are lost when the instance restarts. To make them permanent, also set both options in the MySQL configuration file or as server startup options.
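
As a quick check, the values currently in effect can be read back from the server. The sketch below also shows SET PERSIST, which is an assumption about the environment: it exists only in MySQL 8.0 and later, where it applies the change immediately and records it so that it survives a restart. On older versions the configuration file remains the way to make the settings permanent.

```sql
-- Confirm the values currently in effect.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('sync_binlog', 'innodb_flush_log_at_trx_commit');

-- MySQL 8.0 and later only: apply now and keep the setting across restarts.
SET PERSIST sync_binlog = 500;
SET PERSIST innodb_flush_log_at_trx_commit = 2;
```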


High IO caused by temporary tables

[Symptom]

A slow query appeared on a production MySQL database. The DBA observed that the server's IO usage spiked to 100% while the query was running, and the query took about 7 s to execute.

The SQL statement is as follows:

SELECT DISTINCT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name FROM gm_game g
LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0
LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0
LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0
WHERE g.deleted = 0 ORDER BY g.modify_time DESC LIMIT 20;

[Problem analysis]

Use EXPLAIN to view the execution plan. The results are as follows:

[Figure: MySQL execution plan]

The problem with this SQL statement is obvious: it reads a large amount of data (many rows and, because of g.*, every column of each row), sorts all of it through a temporary table for the ORDER BY, and then returns only 20 rows.

The DBA observed high IO because the statement generated a temporary table too large to fit in memory. MySQL therefore copied all of the data to disk, and IO soared.
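
The plan output from the original article is not reproduced here, but the check described above can be repeated by prefixing the statement with EXPLAIN. This is a sketch against the same tables as in the case; for a query of this shape one would typically expect to find "Using temporary" and "Using filesort" in the Extra column of the plan, which is the sign that MySQL builds and sorts a temporary table.

```sql
-- Show the execution plan without running the query.
EXPLAIN
SELECT DISTINCT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name
FROM gm_game g
LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0
LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0
LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0
WHERE g.deleted = 0
ORDER BY g.modify_time DESC
LIMIT 20;
```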

[Optimization solution]

The general idea of the optimization is to split the SQL statement so that sorting is separated from retrieving the full rows.

The first statement finds the rows that qualify. It only needs to select g.id.

SELECT DISTINCT g.id FROM gm_game g
LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0
LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0
LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0
WHERE g.deleted = 0 ORDER BY g.modify_time DESC LIMIT 20;

The second statement retrieves the detailed data: the IDs returned by the first statement are spliced into the IN clause of the second statement.

SELECT DISTINCT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name FROM gm_game g
LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0
LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0
LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0
WHERE g.deleted = 0 AND g.id IN (.....................) ORDER BY g.modify_time DESC;
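
The same split can also be written as a single statement that joins back to a derived table, sometimes called a deferred join. This is a variant, not the method used in the case above, and it rests on an assumption the article does not state: that id is unique in gm_cp, gm_category and gm_type, so the LEFT JOINs neither add nor remove rows and the inner query can pick the top 20 from gm_game alone. The alias top_ids is a hypothetical name.

```sql
-- Inner query: pick the 20 newest non-deleted game IDs from gm_game alone,
-- so that only narrow rows have to be sorted.
-- Outer query: join back to fetch the full rows and the related names.
SELECT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name
FROM (
    SELECT id
    FROM gm_game
    WHERE deleted = 0
    ORDER BY modify_time DESC
    LIMIT 20
) AS top_ids
JOIN gm_game g ON g.id = top_ids.id
LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0
LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0
LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0
ORDER BY g.modify_time DESC;
```

If the uniqueness assumption does not hold, the two-statement form from the case is the safer choice.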

[Tested results]

On a SATA machine, the query took about 50 s before optimization. After optimization, the first statement took 0.3 s and the second 0.1 s. Execution is more than 100 times faster, and IO usage dropped from 100% to less than 1%.

On an SSD machine, the query took about 7 s before optimization. After optimization, the first statement took 0.3 s and the second 0.1 s. Execution is more than 10 times faster, and IO usage dropped from 100% to less than 1%.

Before optimization, disk IO was the performance bottleneck, which is why the SSD was so much faster than SATA. After optimization, the disk is no longer the bottleneck, and SSD and SATA perform the same.

[Theoretical analysis]

MySQL may use temporary tables when executing SQL queries. Generally, using temporary tables means low performance.

Temporary table storage

MySQL temporary tables are divided into in-memory temporary tables and on-disk temporary tables. In the MySQL versions this article describes, in-memory temporary tables use the MEMORY storage engine and on-disk temporary tables use the MyISAM storage engine. (Later versions differ: MySQL 5.7 uses InnoDB for on-disk temporary tables by default, and MySQL 8.0 uses the TempTable engine in memory by default.)

In general, MySQL first creates an in-memory temporary table. Once that table grows beyond the size limit set in the configuration, MySQL converts it to an on-disk temporary table.

On Linux, on-disk temporary tables are created in the /tmp directory by default. Note that /tmp is often small.
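
How often a server falls back to disk can be estimated from its status counters. The following sketch uses the standard tmpdir variable and the Created_tmp_* status counters; the counters are cumulative since server start, so it is the ratio and the growth over time that matter rather than the absolute numbers.

```sql
-- Directory where on-disk temporary tables and files are written.
SHOW GLOBAL VARIABLES LIKE 'tmpdir';

-- Cumulative counters since server start:
--   Created_tmp_tables       internal temporary tables created
--   Created_tmp_disk_tables  internal temporary tables that ended up on disk
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
```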

When temporary tables are used

1) the ORDER BY clause differs from the GROUP BY clause, for example: GROUP BY name ORDER BY price;
2) in a JOIN query, ORDER BY or GROUP BY uses columns from a table other than the first table, for example: SELECT * FROM TableA, TableB GROUP BY TableB.name ORDER BY TableA.price;
3) DISTINCT is combined with ORDER BY, as in the SELECT DISTINCT ... ORDER BY query in the case above;
4) the SELECT statement specifies the SQL_SMALL_RESULT keyword. SQL_SMALL_RESULT tells MySQL that the result set will be small, so it should use an in-memory temporary table directly instead of sorting. SQL_SMALL_RESULT must be used together with GROUP BY, DISTINCT, or DISTINCTROW. In general there is no need to use this option; let the MySQL server choose.

When on-disk temporary tables are used directly

1) the table contains TEXT or BLOB columns;
2) the GROUP BY or DISTINCT clause contains columns longer than 512 bytes;
3) UNION or UNION ALL is used and the SELECT list contains columns longer than 512 bytes.

Temporary table configuration

tmp_table_size: specifies the maximum size of an in-memory temporary table created by the system; http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_tmp_table_size

max_heap_table_size: specifies the maximum size of a MEMORY table created by the user; http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_max_heap_table_size

Note: the effective size limit for an in-memory temporary table created by the system is the smaller of these two values.
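
The rule in the note can be checked directly on a server. This is a minimal sketch: effective_limit is a hypothetical column alias, and the 64 MB value is purely illustrative, not a recommendation from the original article.

```sql
-- The in-memory limit that actually applies is the smaller of the two values.
SELECT @@global.tmp_table_size      AS tmp_table_size,
       @@global.max_heap_table_size AS max_heap_table_size,
       LEAST(@@global.tmp_table_size, @@global.max_heap_table_size) AS effective_limit;

-- Raise both together; 67108864 bytes (64 MB) is an illustrative value only.
SET GLOBAL tmp_table_size = 67108864;
SET GLOBAL max_heap_table_size = 67108864;
```

Raising only one of the two variables has no effect on the limit, and new global values apply to connections opened after the change.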

Table Design principles

Using temporary tables generally means relatively low performance, and on-disk temporary tables are especially slow. In practice, therefore, temporary tables should be avoided where possible. Common ways to avoid them include:

1) create an index: create an index on the columns used in ORDER BY or GROUP BY;
2) split out long columns: TEXT columns, BLOB columns, and strings longer than 512 bytes are generally used to display information, not as query conditions. When designing a table, move these columns into a separate table.
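
Applied to the gm_game table from the case, the two principles might look like the sketch below. Everything beyond the gm_game table and its deleted and modify_time columns is an assumption made for illustration: the index name, the gm_game_detail table, its description column, and the INT key type are all hypothetical.

```sql
-- 1) Index the filter and sort columns used by the case query
--    (hypothetical index name).
CREATE INDEX idx_deleted_modify_time ON gm_game (deleted, modify_time);

-- 2) Move a long, display-only column into its own table
--    (hypothetical table and column; key type assumed to match gm_game.id).
CREATE TABLE gm_game_detail (
    game_id     INT NOT NULL PRIMARY KEY,  -- same value as gm_game.id
    description TEXT                       -- long text kept out of gm_game
);
```

With the long column stored separately, queries that sort or group on gm_game no longer carry TEXT data into the temporary table, and the detail row is fetched by primary key only when it is displayed.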

SQL optimization

If the table design is already fixed and hard to change, you can instead optimize the SQL statements to reduce the size of the temporary tables and improve execution efficiency.

Common methods to optimize SQL statements are as follows:

1) split SQL statements

Temporary tables are mainly used for sorting and grouping. Many business scenarios sort the data first and then retrieve one page of detailed rows. In this case, you can split the sorting and the retrieval of detailed data into separate SQL statements, as was done in the case above. This reduces the size of the temporary table used for sorting or grouping and makes sorting and grouping more efficient.

2) adjust business requirements and remove unnecessary sorting and grouping

Sometimes the business does not actually need sorting or grouping; the data is sorted only so that it looks tidy or is easier to read, for example in data exports and data lookups. In this case, removing the sorting and grouping has little impact on the business.
