Optimizing temporary table usage: SQL statement performance improved 100 times

Source: Internet
Author: User
Tags: dba, index, sort

"Problem phenomenon"

A slow query was found on the online MySQL database. The DBA observed that server IO soared while the query ran, with IO utilization at 100% and an execution time of about 7s.
The SQL statement is as follows:
SELECT DISTINCT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name FROM gm_game g LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0 LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0 LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0 WHERE g.deleted = 0 ORDER BY g.modify_time DESC LIMIT 20;

"Problem analysis"

Use explain to view the execution plan with the following results:

The problem with this SQL statement is actually quite obvious:
A large amount of data is queried (both a large number of rows and every column, via g.*), then sorted with ORDER BY in a temporary table, yet only 20 rows are returned.
The high IO that the DBA observed arises because the statement generates a huge temporary table that does not fit in memory, so the whole table is copied to disk, which causes IO to soar.

"Optimization scheme"

The overall idea of the optimization is to split the SQL, separating the sort operation from the query for full details.
The first statement: query the matching rows, selecting only g.id:
SELECT DISTINCT g.id FROM gm_game g LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0 LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0 LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0 WHERE g.deleted = 0 ORDER BY g.modify_time DESC LIMIT 20;

The second statement: query the detailed data for the matching rows, using an IN clause to stitch the results of the first statement into the second:
SELECT DISTINCT g.*, cp.name AS cp_name, c.name AS category_name, t.name AS type_name FROM gm_game g LEFT JOIN gm_cp cp ON cp.id = g.cp_id AND cp.deleted = 0 LEFT JOIN gm_category c ON c.id = g.category_id AND c.deleted = 0 LEFT JOIN gm_type t ON t.id = g.type_id AND t.deleted = 0 WHERE g.deleted = 0 AND g.id IN (...) ORDER BY g.modify_time DESC;
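
The stitching step happens in application code. The following is a minimal sketch in Python of how it might look. It uses the standard-library sqlite3 module and an in-memory table only so that the example runs on its own; the original system used MySQL. The helper name fetch_page, the simplified single-table schema (the joins to gm_cp, gm_category and gm_type are left out for brevity), and the sample rows are all assumptions made for illustration.

```python
import sqlite3


def fetch_page(conn, limit=20):
    """Two-step paging: sort on ids only, then fetch the full rows."""
    # Step 1: sort and limit while selecting only the primary key,
    # so whatever the sort has to buffer stays narrow.
    id_rows = conn.execute(
        "SELECT g.id FROM gm_game g WHERE g.deleted = 0 "
        "ORDER BY g.modify_time DESC LIMIT ?",
        (limit,),
    ).fetchall()
    ids = [row[0] for row in id_rows]
    if not ids:
        return []  # an empty IN () list is not valid SQL

    # Step 2: fetch the wide rows for just those ids. The ids are
    # passed as bound parameters rather than pasted into the SQL text.
    placeholders = ", ".join("?" for _ in ids)
    return conn.execute(
        "SELECT g.* FROM gm_game g WHERE g.deleted = 0 "
        f"AND g.id IN ({placeholders}) ORDER BY g.modify_time DESC",
        ids,
    ).fetchall()


# Hypothetical sample data: five games, one of them soft-deleted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gm_game (id INTEGER PRIMARY KEY, name TEXT, "
    "modify_time INTEGER, deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO gm_game VALUES (?, ?, ?, ?)",
    [(1, "a", 10, 0), (2, "b", 50, 0), (3, "c", 30, 1),
     (4, "d", 40, 0), (5, "e", 20, 0)],
)
page = fetch_page(conn, limit=3)
```

The second query repeats the ORDER BY because an IN list does not by itself guarantee any row order. With a MySQL driver the placeholder style would typically be %s rather than ?.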

"measured effect"

Tested on a SATA machine: 50s before optimization; after optimization the first statement takes 0.3s and the second 0.1s, so execution is more than 100 times faster, and IO drops from 100% to less than 1%.
Tested on an SSD machine: 7s before optimization; after optimization the first statement takes 0.3s and the second 0.1s, so execution is more than 10 times faster, and IO drops from 100% to less than 1%.
As you can see, disk IO is the performance bottleneck before optimization, and the SSD is significantly faster than SATA. After optimization the disk is no longer a bottleneck, and SSD and SATA perform the same.
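
The speedup figures can be checked by dividing the original run time by the combined time of the two split statements. A small Python sketch of that arithmetic, using only the timings reported above (the function name speedup is just illustrative):

```python
def speedup(before_s, after_steps_s):
    """Original run time divided by the summed run time of the split statements."""
    return before_s / sum(after_steps_s)


sata = speedup(50, [0.3, 0.1])  # SATA: 50s before, 0.3s + 0.1s after
ssd = speedup(7, [0.3, 0.1])    # SSD: 7s before, 0.3s + 0.1s after
print(f"SATA: {sata:.1f}x, SSD: {ssd:.1f}x")
```

This gives roughly 125 times on SATA and 17.5 times on SSD, consistent with the "more than 100 times" and "more than 10 times" stated above.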

"Theoretical Analysis"

MySQL may use temporary tables when executing SQL queries, and in general, temporary tables mean lower performance.

    • Temporary table storage

MySQL temporary tables are divided into "memory temporary tables" and "disk temporary tables". The memory temporary table uses MySQL's MEMORY storage engine, and the disk temporary table uses MySQL's MyISAM storage engine;
In general, MySQL first creates a memory temporary table; once it exceeds the configured size, MySQL converts it to a disk temporary table;
On Linux, disk temporary tables are stored in the /tmp directory by default, so systems with a small /tmp need to watch for this.

    • Scenarios that use temporary tables

1) The ORDER BY clause differs from the GROUP BY clause, for example: GROUP BY name ORDER BY price;

2) In a join query, ORDER BY or GROUP BY uses a column that does not belong to the first table, for example: SELECT * FROM TableA, TableB GROUP BY TableB.name ORDER BY TableA.price;

3) The DISTINCT keyword is used together with ORDER BY, as in the slow query above (SELECT DISTINCT ... ORDER BY g.modify_time);

4) The SQL_SMALL_RESULT keyword is specified in the SELECT statement. SQL_SMALL_RESULT tells MySQL that the result set will be small, so it can use a memory temporary table directly instead of sorting by index. SQL_SMALL_RESULT must be used with GROUP BY, DISTINCT, or DISTINCTROW. In general we do not need this option and can let the MySQL server choose.

    • Scenarios that use disk temporary tables directly

1) The table contains TEXT or BLOB columns;
2) The GROUP BY or DISTINCT clause contains a column longer than 512 bytes;
3) When using UNION or UNION ALL, the SELECT clause contains a column longer than 512 bytes.

    • Temporary table-related configuration

tmp_table_size: specifies the maximum size of a memory temporary table created by the system; http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_tmp_table_size

max_heap_table_size: specifies the maximum size of a memory table created by the user; http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_max_heap_table_size

Note: the effective size limit for a system-created memory temporary table is the smaller of the two values above.
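
The note above amounts to taking the minimum of the two settings. A minimal Python sketch follows; the function name and the example values are hypothetical, and on a live server the real values could be read with SHOW VARIABLES LIKE 'tmp_table_size' and SHOW VARIABLES LIKE 'max_heap_table_size'.

```python
def in_memory_temp_table_limit(tmp_table_size, max_heap_table_size):
    """Largest size (in bytes) a system-created memory temporary table
    may reach before it is converted to a disk temporary table."""
    return min(tmp_table_size, max_heap_table_size)


MB = 1024 * 1024
# Hypothetical settings: raising only tmp_table_size has no effect here,
# because max_heap_table_size is still the smaller value.
limit = in_memory_temp_table_limit(64 * MB, 16 * MB)
```

In practice this means both variables need to be raised together if the aim is to keep larger temporary tables in memory.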

    • Design principles for tables

Using temporary tables generally means lower performance, and disk temporary tables are slower still, so we should avoid temporary tables in real applications wherever possible. Common ways to avoid them are:
1) Create an index: create an index on the ORDER BY or GROUP BY columns;
2) Split out very long columns: TEXT, BLOB, and strings longer than 512 bytes are generally there to display information and are not used as query conditions, so the table design should move these columns into a separate table.
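
The effect of the first principle can be observed with a small experiment. The sketch below uses SQLite (through Python's standard-library sqlite3 module) rather than MySQL, purely because it runs without a server; SQLite's EXPLAIN QUERY PLAN wording differs from MySQL's EXPLAIN, but the idea carries over: without an index on the sort column the engine sorts in a temporary structure, and with one it can read rows in index order. The index name idx_game_modify_time and the reduced schema are assumptions made for illustration.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail strings of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gm_game (id INTEGER PRIMARY KEY, name TEXT, "
    "modify_time INTEGER, deleted INTEGER)"
)
query = "SELECT id FROM gm_game ORDER BY modify_time DESC LIMIT 20"

before = plan_details(conn, query)  # no index on the sort column yet
conn.execute("CREATE INDEX idx_game_modify_time ON gm_game (modify_time)")
after = plan_details(conn, query)   # the index already provides the order

# SQLite reports a temporary B-tree when it has to sort the rows itself.
sorted_in_temp_before = any("TEMP B-TREE" in d for d in before)
sorted_in_temp_after = any("TEMP B-TREE" in d for d in after)
```

Printing before and after shows the plan text for each case. On MySQL the equivalent check is to compare the Extra column of EXPLAIN before and after creating the index.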

    • SQL optimization

If the table design is already fixed and hard to change, you can still improve execution efficiency by optimizing the SQL statement to reduce the size of the temporary table.
Common ways to optimize SQL statements are:
1) Split the SQL statement
Temporary tables are mainly used for sorting and grouping, and many business operations need to sort first and then fetch one page of detailed data. In that case, sorting and fetching details can be split into separate statements, which shrinks the temporary table used for sorting or grouping and makes those steps faster. This is the approach used in the case above.
2) Optimize the business logic and remove sorting and grouping
Sometimes the business does not actually need sorting or grouping, and the results are sorted only to look tidy or read more easily, for example in data export or data query operations. In such cases, removing the sorting and grouping has little business impact.

    • How can I tell if a temporary table is used?

Use EXPLAIN to view the execution plan; if the Extra column shows Using temporary, a temporary table is used.
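
This check is easy to automate once the EXPLAIN rows have been fetched. The Python sketch below assumes the rows arrive as dictionaries keyed by column name, as a dictionary-style cursor would return them; the function name uses_temporary and the sample rows are hypothetical, and the rows are hand-written for illustration rather than captured from a real server.

```python
def uses_temporary(explain_rows):
    """True if any EXPLAIN row reports a temporary table in its Extra column."""
    # Extra can be NULL (None) for rows that have nothing to report.
    return any(
        "Using temporary" in (row.get("Extra") or "")
        for row in explain_rows
    )


# Hand-written rows for illustration only, not real server output.
sample_plan = [
    {"table": "g", "Extra": "Using where; Using temporary; Using filesort"},
    {"table": "cp", "Extra": None},
]
flagged = uses_temporary(sample_plan)
```

A helper like this could be run over the slow statements of an application to flag candidates for the split-query approach described earlier.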

For more information, please refer to the official MySQL manual: http://dev.mysql.com/doc/refman/5.1/en/internal-temporary-tables.html

from:http://tech.uc.cn/?p=2218