MySQL query performance optimization

Source: Internet
Author: User

Abstract: This article describes how to optimize SQL queries. You can manually use the EXPLAIN statement to check the efficiency of SQL queries. In addition, some principles for optimizing SQL statements are described, mainly about how to optimize SQL statements when retrieving records and loading data.

Use the EXPLAIN statement to check SQL statements

When you put the keyword EXPLAIN in front of a SELECT statement, MySQL explains how it would process the SELECT, including how the tables are joined and in what order.

With the help of EXPLAIN, you can know when you must add an index to the table to obtain a faster SELECT statement that uses the index to locate the record.

EXPLAIN tbl_name

or

EXPLAIN SELECT select_options

EXPLAIN tbl_name is a synonym for DESCRIBE tbl_name or SHOW COLUMNS FROM tbl_name.

The output from EXPLAIN includes the following columns:

· Table
The table referenced by the output row.

· Type
The join type. The different join types are listed here, ordered from best to worst:
system, const, eq_ref, ref, range, index, ALL

· Possible_keys
The possible_keys column indicates which indexes MySQL could use to find the rows in this table.

· Key
The key column shows the key (index) that MySQL actually decided to use. If no index was chosen, the key is NULL.

· Key_len
The key_len column shows the length of the key that MySQL decided to use. If the key is NULL, the length is NULL. Note that this value tells you how many parts of a multi-part key MySQL will actually use.

· Ref
The column ref shows which column or constant is used together with the key to select rows from the table.

· Rows
The rows column shows the number of rows that MySQL believes it must examine to execute the query.

· Extra
If the Extra column contains the text Only index, the information is retrieved using only the index tree, which is generally faster than scanning the entire table. If it contains the text where used, a WHERE clause is used to restrict which rows are matched against the next table or sent to the client.

By multiplying all the values in the rows column of the EXPLAIN output, you get a rough indication of how good a join is: roughly how many rows MySQL must examine to execute the query.

For example, consider the following full join:

mysql> EXPLAIN SELECT student.name FROM student, pet
    -> WHERE student.name = pet.owner;

The result is:
+---------+------+---------------+------+---------+------+------+------------+
| table   | type | possible_keys | key  | key_len | ref  | rows | Extra      |
+---------+------+---------------+------+---------+------+------+------------+
| student | ALL  | NULL          | NULL | NULL    | NULL |   13 |            |
| pet     | ALL  | NULL          | NULL | NULL    | NULL |    9 | where used |
+---------+------+---------------+------+---------+------+------+------------+
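As a small illustration of the "multiply the rows values" rule, the Python sketch below computes the estimate for the student/pet join above. The function name is hypothetical and the result is only the rough upper bound the text describes, not a measurement.

```python
from math import prod

def estimated_rows_examined(rows_per_table):
    """Multiply the per-table `rows` estimates taken from EXPLAIN output."""
    return prod(rows_per_table)

# `rows` values for the student and pet tables in the example above.
print(estimated_rows_examined([13, 9]))  # 117
```

With no usable index on either table, MySQL may have to look at about 13 x 9 = 117 row combinations, which is why adding an index on the join column is the first thing to try.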

SELECT query speed

In general, when you want to make a slow SELECT ... WHERE faster, the first thing to check is whether you can add an index. References between different tables should usually be resolved through indexes. You can use EXPLAIN to determine which indexes are used for a SELECT statement.

Some general suggestions:
· To help MySQL optimize queries better, run myisamchk --analyze on a table after it has been loaded with relevant data. This updates a value for each index that indicates the average number of rows with the same value (for a unique index, this is of course always 1).
· To sort an index and data according to an index, use myisamchk --sort-index --sort-records=1 (if you want to sort on index 1). If you have a unique index from which you want to read all records in index order, this is a good way to make that faster. Note, however, that this sorting is not written optimally and will take a long time for a large table!

How does MySQL optimize the WHERE clause?

The WHERE optimizations are described under SELECT because they are mostly used there, but the same optimizations apply to DELETE and UPDATE statements.

Note that this section is incomplete. MySQL performs many optimizations, and we have not had time to document them all.

Some optimizations implemented by MySQL are listed below:

1. Removal of unnecessary parentheses:
((a AND b) AND c OR (((a AND b) AND (c AND d))))
-> (a AND b AND c) OR (a AND b AND c AND d)

2. Constant propagation:
(a<b AND b=c) AND a=5
-> b>5 AND b=c AND a=5

3. Removal of constant conditions (needed because of constant propagation):
(b>=5 AND b=5) OR (b=6 AND 5=5) OR (b=7 AND 5=6)
-> b=5 OR b=6

4. Constant expressions used by indexes are evaluated only once.

5. COUNT(*) on a single table without a WHERE clause is retrieved directly from the table information. This is also done for any NOT NULL expression when only one table is used.

6. Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and return no rows.

7. HAVING is merged with WHERE if you do not use GROUP BY or group functions (COUNT(), MIN(), ...).

8. For each sub-join, a simpler WHERE is constructed so that the WHERE can be evaluated quickly and records can be skipped as early as possible.

9. All constant tables are read before any other tables in the query. A constant table is:

· An empty table or a table with one row.

· A table that is used with a WHERE clause on a UNIQUE index or a PRIMARY KEY, where all index parts are compared with constant expressions and the index parts are defined as NOT NULL.

All of the following tables are used as constant tables.

mysql> SELECT * FROM t WHERE primary_key=1;
mysql> SELECT * FROM t1,t2
           WHERE t1.primary_key=1 AND t2.primary_key=t1.id;

10. The best join combination for joining the tables is found by trying all possibilities. If all columns in ORDER BY and GROUP BY come from the same table, that table is preferred first when joining.

11. If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.

12. If you use SQL_SMALL_RESULT, MySQL uses an in-memory temporary table.

13. Because DISTINCT is converted to a GROUP BY on all columns, DISTINCT combined with ORDER BY also requires a temporary table in many cases.

14. Each table's indexes are queried, and the best index that spans fewer than 30% of the rows is used. If no such index can be found, a quick table scan is used.

15. In some cases, MySQL can read rows from the index without even consulting the data file. If all columns used from the index are numeric, only the index tree is used to resolve the query.

16. Before each record is output, records that do not match the HAVING clause are skipped.
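The rewrites in items 2 and 3 can be checked by brute force. The Python sketch below is not how MySQL performs these optimizations; it only confirms that the "before" and "after" predicates agree on every combination of small integer values. The left-hand side of the second rewrite is taken to be (a<b AND b=c) AND a=5, and SQL NULL (three-valued logic) is deliberately ignored.

```python
from itertools import product

def equivalent(before, after, arity, values=range(0, 10)):
    """Return True if two predicates agree on every combination of values."""
    return all(
        before(*args) == after(*args)
        for args in product(values, repeat=arity)
    )

# Item 2 (constant propagation):
# (a<b AND b=c) AND a=5  ->  b>5 AND b=c AND a=5
rule2 = equivalent(
    lambda a, b, c: (a < b and b == c) and a == 5,
    lambda a, b, c: b > 5 and b == c and a == 5,
    arity=3,
)

# Item 3 (removal of constant conditions):
# (b>=5 AND b=5) OR (b=6 AND 5=5) OR (b=7 AND 5=6)  ->  b=5 OR b=6
rule3 = equivalent(
    lambda b: (b >= 5 and b == 5) or (b == 6 and 5 == 5) or (b == 7 and 5 == 6),
    lambda b: b == 5 or b == 6,
    arity=1,
)

print(rule2, rule3)  # True True
```

Because the optimizer does this rewriting itself, there is normally no need to simplify such conditions by hand for performance reasons.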

Below are some examples of queries that are very fast:

mysql> SELECT COUNT(*) FROM tbl_name;
mysql> SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
mysql> SELECT MAX(key_part2) FROM tbl_name
           WHERE key_part1=constant;
mysql> SELECT ... FROM tbl_name
           ORDER BY key_part1,key_part2,... LIMIT 10;
mysql> SELECT ... FROM tbl_name
           ORDER BY key_part1 DESC,key_part2 DESC,... LIMIT 10;

The following queries are resolved using only the index tree (assuming that the indexed columns are numeric):

mysql> SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
mysql> SELECT COUNT(*) FROM tbl_name
           WHERE key_part1=val1 AND key_part2=val2;
mysql> SELECT key_part2 FROM tbl_name GROUP BY key_part1;

The following queries use an index to retrieve rows in sorted order, without a separate sorting pass:

mysql> SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,...
mysql> SELECT ... FROM tbl_name ORDER BY key_part1 DESC,key_part2 DESC,...
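The ORDER BY examples share a pattern: the sort columns follow the index parts from the left, and all of them use the same direction. The Python sketch below encodes that pattern as a simplified rule of thumb. The function name is hypothetical, and the rule is an assumption drawn from the examples above rather than the optimizer's full logic (which depends on the MySQL version and the WHERE clause).

```python
def can_sort_with_index(index_columns, order_by):
    """Simplified check: ORDER BY must follow a leftmost index prefix,
    with every column sorted in the same direction.

    index_columns: column names in index order, e.g. ["key_part1", "key_part2"]
    order_by: list of (column, direction) pairs, direction "ASC" or "DESC"
    """
    if not order_by or len(order_by) > len(index_columns):
        return False
    directions = {direction for _, direction in order_by}
    if len(directions) != 1:
        return False  # mixed ASC/DESC is treated here as needing a separate sort
    columns = [column for column, _ in order_by]
    return columns == list(index_columns[:len(columns)])

index = ["key_part1", "key_part2"]
print(can_sort_with_index(index, [("key_part1", "ASC"), ("key_part2", "ASC")]))   # True
print(can_sort_with_index(index, [("key_part1", "DESC"), ("key_part2", "DESC")])) # True
print(can_sort_with_index(index, [("key_part2", "ASC")]))                         # False
```

In practice, EXPLAIN remains the reliable way to see whether a given query avoids the extra sort.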

How MySQL optimizes LEFT JOIN

In MySQL, A LEFT JOIN B is implemented as follows:

1. Table B is set to depend on table A.

2. Table A is set to depend on all tables (except B) that are used in the LEFT JOIN condition.

3. All LEFT JOIN conditions are moved to the WHERE clause.

4. All standard join optimizations are performed, with the exception that a table is always read after all the tables it depends on. If there is a circular dependency, MySQL issues an error.

5. All standard WHERE optimizations are performed.

6. If a row in A matches the WHERE clause but no row in B matches the LEFT JOIN condition, an extra row for B is generated with all columns set to NULL.

7. If you use LEFT JOIN to find rows that do not exist in some table and the WHERE part contains the test column_name IS NULL, where column_name is a column declared as NOT NULL, MySQL stops searching for more rows (for a particular key combination) as soon as it has found one row that matches the LEFT JOIN condition.
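Steps 6 and 7 can be pictured with a plain nested-loop join. The Python sketch below is a conceptual model only, not MySQL's implementation: None stands in for the all-NULL row of B, and the pet column and sample data are hypothetical.

```python
def left_join(a_rows, b_rows, key_a, key_b):
    """Nested-loop A LEFT JOIN B; an A row without a match is paired with None."""
    result = []
    for a in a_rows:
        matches = [b for b in b_rows if b[key_b] == a[key_a]]
        if matches:
            result.extend((a, b) for b in matches)
        else:
            result.append((a, None))  # step 6: the generated NULL row
    return result

def rows_missing_from_b(a_rows, b_rows, key_a, key_b):
    """A LEFT JOIN B ... WHERE B.key IS NULL, with the early exit of step 7."""
    missing = []
    for a in a_rows:
        # any() stops scanning B as soon as one matching row is found
        if not any(b[key_b] == a[key_a] for b in b_rows):
            missing.append(a)
    return missing

students = [{"name": "Ann"}, {"name": "Bob"}]
pets = [{"owner": "Ann", "pet": "cat"}, {"owner": "Ann", "pet": "dog"}]

print(left_join(students, pets, "name", "owner"))
print(rows_missing_from_b(students, pets, "name", "owner"))  # [{'name': 'Bob'}]
```

The second function never looks at Ann's second pet: one match is enough to know that the IS NULL test will fail for that row of A.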

How MySQL optimizes LIMIT

In some cases, MySQL handles a query differently when you use LIMIT # and do not use HAVING:

1. If you select only a few rows with LIMIT, MySQL uses indexes in some cases where it would normally prefer a full table scan.

2. If you use LIMIT # with ORDER BY, MySQL ends the sorting as soon as it has found the first # rows instead of sorting the whole table.

3. When LIMIT # is combined with DISTINCT, MySQL stops as soon as it has found # unique rows.

4. In some cases, a GROUP BY can be resolved by reading the key in order (or sorting on the key) and then calculating summaries until the key value changes. In this case, LIMIT # does not calculate any unnecessary groups.

5. As soon as MySQL has sent the first # rows to the client, it aborts the query.

6. LIMIT 0 always quickly returns an empty set. This is useful for checking a query and for obtaining the column types of the result columns.

7. The size of temporary tables is calculated using LIMIT # to determine how much space is needed to resolve the query.
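The ideas behind items 2 and 3 can be sketched in Python. This is an analogy under stated assumptions, not MySQL's code: heapq.nsmallest keeps only the best # rows instead of sorting everything, and the DISTINCT helper stops reading its input once # unique values have been produced. Both function names are hypothetical.

```python
import heapq
from itertools import islice

def order_by_limit(rows, key, limit):
    """ORDER BY key LIMIT n without a full sort: keep only the n smallest rows."""
    return heapq.nsmallest(limit, rows, key=key)

def distinct_limit(values, limit):
    """SELECT DISTINCT ... LIMIT n: stop reading once n unique values are seen."""
    def unique(seq):
        seen = set()
        for value in seq:
            if value not in seen:
                seen.add(value)
                yield value
    return list(islice(unique(values), limit))

print(order_by_limit([7, 3, 9, 1, 5], key=lambda row: row, limit=2))  # [1, 3]
print(distinct_limit([4, 4, 2, 4, 8, 2, 6], limit=3))                 # [4, 2, 8]
```

In both cases the work grows with the size of the LIMIT rather than with the size of the full result, which is the effect the list above describes.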

Record loading and modification speed

Most of the time you are concerned with optimizing SELECT queries, because they are the most common kind of query and it is not always obvious how to optimize them. Loading data into the database is comparatively straightforward. Even so, there are strategies for making data loading more efficient. The basic principles are as follows:

· Batch loading is faster than loading one row at a time, because the index cache does not need to be flushed after each record; it can be flushed once at the end of the batch.

· Loading a table with no indexes is faster than loading a table with indexes. If there are indexes, each record must not only be added to the data file, but every index must also be modified to reflect the new record.

· Shorter SQL statements are faster than longer ones, because they require less parsing on the server and can be sent from the client to the server over the network more quickly.

Some of these factors seem insignificant (especially the last one), but if you want to load a large amount of data, even a small factor will produce very different results.

INSERT query speed

The time for inserting a record is composed of the following:

· Connection: (3)

· Send a query to the server: (2)

· Parse the query: (2)

· Insert record: (1 x record size)

· Insert indexes: (1 x number of indexes)

· Close: (1)

The numbers are roughly proportional to the overall time. This does not take into account the initial overhead of opening tables (which is done once for each concurrently running query).

The size of the table slows down the insertion of indexes by N log N (B-trees).
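To see why the fixed costs matter, the relative weights from the list above can be plugged into a toy cost model. The Python sketch below is only an illustration: the weights are proportions rather than time units, the function names are hypothetical, and it assumes the worst case in which every single-row INSERT pays for its own connect, send, parse, and close.

```python
# Relative weights from the list above; they are proportions, not time units.
CONNECT, SEND, PARSE, CLOSE = 3, 2, 2, 1

def separate_inserts_cost(n_rows, record_size, n_indexes):
    """N single-row INSERTs, each paying the full fixed overhead."""
    per_row = CONNECT + SEND + PARSE + record_size + n_indexes + CLOSE
    return n_rows * per_row

def multi_row_insert_cost(n_rows, record_size, n_indexes):
    """One connection and one multi-row INSERT: fixed overhead is paid once."""
    fixed = CONNECT + SEND + PARSE + CLOSE
    return fixed + n_rows * (record_size + n_indexes)

print(separate_inserts_cost(100, 1, 2))  # 1100
print(multi_row_insert_cost(100, 1, 2))  # 308
```

The model ignores that a longer statement takes more time to send and parse, so the real gain is smaller, but it shows why most of the acceleration methods aim to pay the per-statement overhead less often.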

Some methods to accelerate insertion:

· If you are inserting many rows from the same client at the same time, use INSERT statements with multiple value lists. This is much faster (several times faster in some cases) than using separate INSERT statements.

· If you are inserting many rows from different clients, you can get higher speed by using the INSERT DELAYED statement.

· Note that with MyISAM, if the table contains no deleted rows, rows can be inserted while SELECT statements are running.

· When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using many INSERT statements.

· When a table has many indexes, it is possible with some extra work to make LOAD DATA INFILE run even faster. Use the following procedure:

1. Optionally create the table with CREATE TABLE, for example using mysql or Perl-DBI.

2. Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.

3. Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This removes all use of indexes from the table.

4. Insert data into the table with LOAD DATA INFILE. This does not update any indexes and is therefore very fast.

5. If you have myisampack and want to compress the table, run myisampack on it.

6. Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name. This builds the index tree in memory before writing it to disk, which is much faster because it avoids a large number of disk seeks. The resulting index tree is also perfectly balanced.

7. Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.

This procedure will be built into LOAD DATA INFILE in a future version of MySQL.

· You can lock your table to accelerate insertion.

mysql> LOCK TABLES a WRITE;
mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
mysql> INSERT INTO a VALUES (8,26),(6,29);
mysql> UNLOCK TABLES;

The main speed difference is that the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally there would be as many index buffer flushes as there are INSERT statements. Locking is not needed if you can insert all rows with a single statement. Locking also lowers the total time of multi-connection tests, but the maximum wait time for some threads goes up (because they wait for locks). For example:

thread 1 does 1000 inserts
thread 2, 3, and 4 do 1 insert
thread 5 does 1000 inserts

If you do not use locking, threads 2, 3, and 4 finish before 1 and 5. If you use locking, 2, 3, and 4 probably do not finish before 1 or 5, but the total time should be about 40% faster. Because INSERT, UPDATE, and DELETE operations are very fast in MySQL, you get better overall performance by adding locks around everything that does more than about 5 inserts or updates in a row. If you do very many inserts in a row, you can do a LOCK TABLES followed by an UNLOCK TABLES once in a while (about every 1000 rows) to allow other threads to access the table. This still results in a good performance gain. Of course, LOAD DATA INFILE is still much faster for loading data.
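The "lock, insert a batch, unlock about every 1000 rows" advice can be sketched as a small statement generator. The Python below is a minimal illustration under stated assumptions: the function name is hypothetical, the table name must be a trusted identifier, and the rows are assumed to contain only integers (real code should use the database driver's parameter binding instead of building SQL strings).

```python
def locked_insert_batches(table, rows, rows_per_lock=1000):
    """Yield SQL statements that insert `rows` in locked chunks.

    Each chunk is one multi-row INSERT wrapped in LOCK TABLES / UNLOCK TABLES,
    so other threads get access to the table between chunks.
    """
    for start in range(0, len(rows), rows_per_lock):
        chunk = rows[start:start + rows_per_lock]
        values = ",".join(
            "(" + ",".join(str(int(value)) for value in row) + ")"
            for row in chunk
        )
        yield f"LOCK TABLES {table} WRITE"
        yield f"INSERT INTO {table} VALUES {values}"
        yield "UNLOCK TABLES"

for statement in locked_insert_batches("a", [(1, 23), (2, 34), (4, 33)], rows_per_lock=2):
    print(statement + ";")
```

A smaller rows_per_lock shortens the wait for other threads at the price of more index buffer flushes, which is the trade-off described above.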

To speed up both LOAD DATA INFILE and INSERT, enlarge the key buffer.

UPDATE query speed

Update queries are optimized as a SELECT query with the additional overhead of a write. The speed of the write depends on the size of the data being updated and the number of indexes that are updated.

Another way to get fast updates is to delay updates and then do many updates in a row later. If you lock the table, doing many updates in a row is much quicker than doing them one at a time.

Note that with the dynamic record format, updating a record to a longer total length may split the record. If you do this often, it is important to run OPTIMIZE TABLE from time to time.

DELETE query speed

The time to delete a record is exactly proportional to the number of indexes. To delete records more quickly, you can increase the size of the index cache.

Deleting all rows from a table is much faster than deleting a large part of the rows.

Impact of indexes on effective data loading

If the table is indexed, you can reduce the index overhead by using batch inserts (LOAD DATA or multi-row INSERT statements). This minimizes the impact of index updates, because the indexes need to be flushed only after all rows have been processed rather than after each row.

· If you need to load a large amount of data into a new table, create the table and load the data before creating the indexes. Creating an index once is faster than updating it for every row.

· Dropping or disabling indexes before loading, then re-creating or enabling them after the data is loaded, may make loading faster.
· If you want to use the drop-or-disable strategy for data loading, run some experiments to check whether it is worthwhile (if you load a small amount of data into a large table, rebuilding the indexes may take longer than loading the data).

You can use DROP INDEX and CREATE INDEX to drop and re-create indexes.
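The drop, load, and re-create sequence might look like the statements produced by the Python sketch below. The function name, table name, index name, column, and file path are all hypothetical placeholders, and the identifiers are assumed to be trusted, since they are interpolated directly into the SQL text.

```python
def reload_with_index_rebuild(table, index_name, index_columns, data_file):
    """Return the statements for: drop the index, bulk load, re-create the index."""
    columns = ", ".join(index_columns)
    return [
        f"DROP INDEX {index_name} ON {table}",
        f"LOAD DATA INFILE '{data_file}' INTO TABLE {table}",
        f"CREATE INDEX {index_name} ON {table} ({columns})",
    ]

for statement in reload_with_index_rebuild("pet", "idx_owner", ["owner"], "/tmp/pet.txt"):
    print(statement + ";")
```

Whether this beats loading into the indexed table depends on how much data is loaded relative to the table size, so it is worth timing both approaches.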

Another option is to use myisamchk or isamchk to disable and enable indexes. This requires an account on the MySQL server host and the permission to write data to the table file. To disable table indexes, you can enter the corresponding database directory and execute one of the following commands:

shell> myisamchk --keys-used=0 tbl_name
shell> isamchk --keys-used=0 tbl_name

Use myisamchk for MyISAM tables, which have an index file with the .MYI extension, and isamchk for ISAM tables, which have an index file with the .ISM extension. After loading data into the table, activate the indexes as follows:

shell> myisamchk --recover --quick --keys-used=n tbl_name
shell> isamchk --recover --quick --keys-used=n tbl_name

Here n indicates the number of indexes the table has. You can obtain this value by calling the corresponding utility with the --description option:

shell> myisamchk --description tbl_name
shell> isamchk --description tbl_name

If you decide to disable and re-enable indexes, use the table repair locking protocol described in Chapter 13 to prevent the server from changing the table at the same time. Although you are not repairing the table, you are modifying it in the same way a repair does, so the same locking protocol applies.

The data-loading principles above also apply to mixed-query environments in which clients perform different kinds of operations. For example, you generally want to avoid long-running SELECT queries on tables that are updated frequently, because they cause a lot of contention and reduce the performance of the writers. One possible solution is to store new records in a temporary table and add them to the main table periodically. This is not feasible if you need to access new records immediately, but it works whenever you can do without them for a short time. Using a temporary table has two advantages. First, it reduces contention with SELECT queries on the main table, so they execute faster. Second, the total time to load the records from the temporary table into the main table is less than the total time to load them individually, because the index cache needs to be flushed only at the end of each batch load rather than after every row.
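The second advantage can be made concrete with a small simulation. The Python sketch below is a toy model, not database code: the class and attribute names are hypothetical, lists stand in for the temporary and main tables, and the batch is moved when it reaches a fixed size (the text suggests moving records periodically, which a timer could do just as well).

```python
class StagingBuffer:
    """Collect rows in a staging area and move them to the main table in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.staged = []        # stands in for the temporary table
        self.main_table = []    # stands in for the main table
        self.index_flushes = 0  # counts simulated index-cache flushes

    def insert(self, row):
        self.staged.append(row)
        if len(self.staged) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.staged:
            self.main_table.extend(self.staged)
            self.staged.clear()
            self.index_flushes += 1  # one flush per batch instead of one per row

buffer = StagingBuffer(batch_size=100)
for i in range(250):
    buffer.insert(i)
buffer.flush()  # move the remaining rows
print(len(buffer.main_table), buffer.index_flushes)  # 250 3
```

Loading the same 250 rows one at a time would flush the index cache 250 times in this model, compared with 3 flushes for the batched version.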

One application of this strategy is logging Web page accesses from a Web server into a MySQL database. In that case, getting the records into the main table immediately is probably not a high priority.

Another strategy for reducing index flushing is to use the DELAY_KEY_WRITE table creation option for MyISAM tables, provided that your data is such that it is not essential for every single record to be inserted in the event of an abnormal shutdown (this may be the case if you use MySQL for some kinds of data-entry work). With this option, the index cache is flushed only occasionally instead of after each insert.

If you want to use delayed index flushing server-wide, start mysqld with the --delay-key-write option. In this case, index block writes are delayed until blocks must be flushed to make room for other index values, until a flush-tables command is executed, or until the indexed table is closed.
