Summary of MySQL query optimization methods


Common query optimization

1: MAX() optimization: add an index on the column being aggregated.
2: COUNT() optimization: COUNT(*) counts every row, including rows that contain NULL values, while COUNT(field_name) counts only the rows in which that field is not NULL (this is why the two forms can return different results). Choose the form that matches the count you actually need. COUNT(NULL) returns 0, because NULL values are never counted.
3: Subquery optimization: rewrite the subquery as a JOIN where possible (in a one-to-many relationship, watch out for duplicate rows).
4: GROUP BY optimization: if the query contains a subquery, apply the WHERE condition and the GROUP BY inside the subquery, and avoid using GROUP BY at the outermost level of a complex query (GROUP BY at the outermost level uses a temporary table).
5: ORDER BY and LIMIT optimization:

Mode 1: Where possible, ORDER BY the primary key or an indexed column.
Mode 2: Use an auto-increment column: record the primary key (or auto-increment value) last returned by the previous page. This column must not have missing values, otherwise pages come back with too few rows; the solution is to add a separate index_id column and index it. Filter first on values greater than the last value of the previous page and no greater than that value plus the page size, then apply ORDER BY and LIMIT.
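The difference between the COUNT() forms in item 2 is easy to check on a tiny table. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained; both follow the standard SQL rule that COUNT ignores NULL). The table t and its contents are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; COUNT handles NULL the same way in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field_name TEXT)")
conn.executemany(
    "INSERT INTO t (field_name) VALUES (?)",
    [("a",), (None,), ("b",), (None,)],  # two of the four rows hold NULL
)

# COUNT(*) counts rows; COUNT(column) counts non-NULL values in that column.
count_all = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_col = conn.execute("SELECT COUNT(field_name) FROM t").fetchone()[0]
count_null = conn.execute("SELECT COUNT(NULL) FROM t").fetchone()[0]

print(count_all, count_col, count_null)  # 4 2 0
```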
PS: Conditions under which a composite index can be used:

1: Filtering in the WHERE clause: the condition must match a leftmost prefix of the index
2: Sorting: the index cannot be used when its columns are sorted in opposite directions
3: When a column is used in a range query, the index columns to its right cannot use the index
The idea behind all of these optimizations is to avoid scanning too many records.
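Mode 2 of the ORDER BY and LIMIT item above is one way to avoid scanning skipped rows when paging. A minimal sketch follows, again using sqlite3 as a stand-in for MySQL. It is a slight variation on the text: instead of bounding the range by "last value plus page size" (which needs a gap-free column), it filters on id greater than the last seen id and lets LIMIT cap the page, so gaps in id do not shorten a page. The article table and the fetch_page helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(i, f"title {i}") for i in range(1, 11)],
)

def fetch_page(conn, last_id, page_size):
    """Return up to page_size rows whose id is greater than last_id, in id order."""
    return conn.execute(
        "SELECT id, title FROM article WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()

# The caller remembers the last id of each page and passes it to the next call.
page1 = fetch_page(conn, 0, 3)
page2 = fetch_page(conn, page1[-1][0], 3)

print([row[0] for row in page1])  # [1, 2, 3]
print([row[0] for row in page2])  # [4, 5, 6]
```

Because the filter runs on the primary key, each page starts with an index lookup rather than reading and discarding all earlier rows, which is what a large OFFSET would do.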

Principles for creating indexes:

1: Index the columns used in WHERE, ORDER BY, and GROUP BY clauses
2: The smaller the indexed column, the better
3: In a composite index, put the more selective columns first

Table-Level optimization

1: Normalize the tables
2: Add or remove some redundancy where appropriate, that is, denormalize (trading space for time)
3: Split tables with a very large number of columns vertically
Principle:
1: Move rarely used columns into a separate table
2: Store large fields in a separate table
3: Keep columns that are often used together in the same table
4: Split horizontally when the table holds a very large volume of data
Method:
1: Hash on a chosen column; to split into 5 tables, take the remainder modulo 5 (0-4) and store each row in the table that corresponds to its remainder
2: Store rows with different hash IDs in different tables
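The hash-and-remainder method above can be sketched in a few lines of Python. The table prefix orders_, the user_id key, and the helper names are hypothetical; the only thing taken from the text is the rule "remainder modulo 5 selects the table".

```python
def shard_table(user_id: int, shard_count: int = 5, prefix: str = "orders_") -> str:
    """Map a numeric key to the name of the table that stores it."""
    return f"{prefix}{user_id % shard_count}"

def route(ids, shard_count: int = 5):
    """Group keys by target table so each table can be written or queried once."""
    buckets = {}
    for key in ids:
        buckets.setdefault(shard_table(key, shard_count), []).append(key)
    return buckets

print(shard_table(12))          # orders_2
print(route([0, 5, 7, 13]))     # {'orders_0': [0, 5], 'orders_2': [7], 'orders_3': [13]}
```

A query for a single key touches one table, while a query across many keys has to visit several tables and merge the results.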
Problem:
1: Queries that span partitions
2: Statistics and back-office reporting

(The front end queries the split tables; the back end runs its summary reports against a summary table.)

I. Improving query speed with the query cache

In general, the database server executes an SQL statement every time it receives one from a client. It does so even when it receives exactly the same statement again within a short interval (for example, within 1 minute). This guarantees that results are always current, but most of the time the data does not need to be fully real-time, and some delay is acceptable. In that case, re-executing exactly the same SQL within a short time is wasteful.
Fortunately, MySQL provides a query cache, also called the query buffer (available in MySQL 4.0.1 and later; it was removed in MySQL 8.0). The query cache can improve query performance to some extent.

1. Enabling the query cache through the my.ini file in the MySQL installation directory:

The setting is simple: set query_cache_type to 1. With this setting, before executing any SELECT statement MySQL checks its cache to see whether the same SELECT has been executed before; if it has, and the cached result has not expired, the result is returned directly to the client. When writing SQL, however, note that the query cache is case-sensitive. Consider the following two SELECT statements:

SELECT * FROM TABLE1
select * from TABLE1
To the query cache, these two statements are completely different SELECTs. The cache also does not normalize whitespace, so avoid unnecessary spaces when writing SQL, especially at the start and end of a statement (the cache does not trim leading or trailing spaces).
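The effect of exact-text matching can be illustrated with a toy model. The class below is not MySQL's implementation; it is a minimal sketch that assumes only what the text states, namely that the cache is keyed on the statement text exactly as written. ExactMatchCache and its counters are hypothetical names, and invalidation is left out.

```python
class ExactMatchCache:
    """Toy result cache keyed on the raw statement text (no normalization)."""

    def __init__(self, execute):
        self._execute = execute   # callable that really runs the statement
        self._results = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql):
        if sql in self._results:  # byte-for-byte comparison of the text
            self.hits += 1
            return self._results[sql]
        self.misses += 1
        result = self._execute(sql)
        self._results[sql] = result
        return result

cache = ExactMatchCache(lambda sql: "rows")
statements = [
    "SELECT * FROM TABLE1",
    "SELECT * FROM TABLE1",    # identical text: served from the cache
    "select * from TABLE1",    # different case: treated as a new statement
    " SELECT * FROM TABLE1",   # leading space: treated as a new statement
]
for stmt in statements:
    cache.query(stmt)

print(cache.hits, cache.misses)  # 1 3
```

Only the second statement is a hit, which is why consistent casing and spacing matter if the cache is to be effective.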

2. Temporarily bypassing the query cache:

Running without the query cache can sometimes cost performance, but some SQL statements need real-time results or are run infrequently (perhaps once or twice a day). For those, the cache should be bypassed. This can be done by changing the value of query_cache_type, but that turns the query cache off for every statement.
MySQL 5.0 provides a way to bypass the query cache for a single statement: SQL_NO_CACHE.

SELECT SQL_NO_CACHE field1, field2 FROM TABLE1
Because this statement uses SQL_NO_CACHE, the server does not look in the cache, whether or not the statement has been executed before, and executes it every time.

3. Enabling the query cache on demand:

We can also set query_cache_type to 2 in my.ini, so that the query cache is used only for statements that specify SQL_CACHE.

SELECT SQL_CACHE * FROM TABLE1

II. Automatic optimization of MySQL queries

Indexes are very important for databases, and they can improve query performance. But sometimes using an index degrades performance. Consider the following SALES table:

CREATE TABLE SALES
(
ID INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
NAME VARCHAR(100) NOT NULL, -- length 100 is assumed; the original value was lost
PRICE FLOAT NOT NULL,
SALE_COUNT INT NOT NULL,
SALE_DATE DATE NOT NULL,
PRIMARY KEY (ID),
INDEX (NAME),
INDEX (SALE_DATE)
)
Suppose this table holds millions of rows and we want the average price in 2004 and 2005 of the item with commodity number 1000. We can write the following SQL statement:

SELECT AVG(price) FROM SALES
WHERE id = 1000 AND sale_date BETWEEN '2004-01-01' AND '2005-12-31';
If this product accounts for a very large share of the rows, say 50% or more of the SALES table, then using the index on sale_date to compute the average is rather slow, because using the index means the index itself has to be traversed first. When a very large number of rows satisfy the condition (such as 50% or more of the table), this is slower than scanning the whole table. MySQL therefore decides automatically whether to use the index, based on the proportion of the table that satisfies the condition.

For MySQL, the index is not used once the query result accounts for roughly 30% of the rows in the table; this ratio was chosen by the MySQL developers based on their experience. The actual threshold varies with the storage engine in use.

III. Index-based sorting

One of MySQL's weaknesses is sorting. Although MySQL can query about 15,000 records in 1 second, it generally uses only one index per table when executing a query. Therefore, if the WHERE condition already occupies the index, the index is not used for the sort, which greatly reduces query speed. Consider the following SQL statement:

SELECT * FROM SALES WHERE NAME = 'name' ORDER BY SALE_DATE DESC;
The WHERE clause above already uses the index on the name column, so no index is used when sorting by sale_date. To solve this problem, we can create a composite index on the SALES table:

ALTER TABLE SALES DROP INDEX NAME, ADD INDEX (NAME, SALE_DATE)
This noticeably speeds up the SELECT statement above. Note, however, that with this approach the sort column should not be used on its own in the WHERE clause. In this example, a query that filters only by sale_date will be slow: sorting is fast, but there is no longer a separate index on the sale_date column.

SELECT * FROM SALES WHERE NAME = 'name1' AND NAME = 'name2'
The query above looks for records where name equals both name1 and name2. Obviously this is an impossible query: the WHERE condition is always false. Before executing a statement, MySQL analyzes the WHERE condition to see whether it can ever be satisfied, and does not execute the statement if it cannot. To verify this, first run EXPLAIN on the following statement:

EXPLAIN SELECT * FROM SALES WHERE NAME = 'name1'
This is a normal query, and the table item in the execution information returned by EXPLAIN is SALES, which means that MySQL operates on the SALES table. Now look at the following statement:

EXPLAIN SELECT * FROM SALES WHERE NAME = 'name1' AND NAME = 'name2'
This time the table item is empty, which means that MySQL does not operate on the SALES table at all.

IV. Using query options to improve performance

Besides the ordinary form of the SELECT statement, MySQL provides many options that can improve query performance. SQL_NO_CACHE and SQL_CACHE, described above for controlling the query cache, are two of them. This section introduces a few other commonly used query options.

1. STRAIGHT_JOIN: forcing the join order

When a query joins two or more tables, we normally do not care which table MySQL reads first and which it joins afterwards. MySQL decides the join order internally through a series of calculations and evaluations. In the following statement, for example, the order in which TABLE1 and TABLE2 are joined is not fixed:

SELECT TABLE1.FIELD1, TABLE2.FIELD2 FROM TABLE1, TABLE2 WHERE ...
If developers need to control the join order themselves, they can use the STRAIGHT_JOIN keyword, as in the following statement:

SELECT TABLE1.FIELD1, TABLE2.FIELD2 FROM TABLE1 STRAIGHT_JOIN TABLE2 WHERE ...
Here STRAIGHT_JOIN forces MySQL to join the tables in the order TABLE1, then TABLE2. If you believe your own join order is more efficient than the one MySQL chooses, you can fix the order with STRAIGHT_JOIN.

2. Controlling index use to improve performance

The use of indexes has been discussed above. In general, MySQL decides for itself whether to use an index and which index to use for a query.
In some special cases, however, we want MySQL to consider only one or a few indexes, or not to use a particular index at all. This is what MySQL's index hints are for.

(1) Restricting the indexes that may be used:

A table may have many indexes, and all of them are candidates when MySQL chooses one. Sometimes we want MySQL to consider only a few of them rather than all, which is done with USE INDEX.

SELECT * FROM TABLE1 USE INDEX (FIELD1, FIELD2) ...
As this statement shows, no matter how many indexes exist on TABLE1, MySQL considers only the indexes on FIELD1 and FIELD2 when choosing one (the hint takes index names; here the indexes are named after their columns).

(2) Excluding indexes from use:

If most indexes should be considered and only a few should not, IGNORE INDEX makes the opposite selection. The previous example listed the indexes to consider; IGNORE INDEX lists the indexes to leave out.

SELECT * FROM TABLE1 IGNORE INDEX (FIELD1, FIELD2) ...
In this statement, only the indexes on FIELD1 and FIELD2 of TABLE1 are excluded.

(3) Forcing the use of an index:

The two hints above only narrow MySQL's choices, which means MySQL does not necessarily use any of the listed indexes. Sometimes we want MySQL to use a specific index (since MySQL generally uses only one index per table in a query, only one index can usefully be forced). This is done with FORCE INDEX.

SELECT * FROM TABLE1 FORCE INDEX (FIELD1) ...
This statement uses only the index on FIELD1, not the indexes on other columns.

3. Using temporary tables to improve query performance

When the result set of a query is large, the SQL_BUFFER_RESULT option forces the result set into a temporary table, so that MySQL can release its table locks quickly (letting other SQL statements access the rows) and can serve the large result set to the client over a longer period.

SELECT SQL_BUFFER_RESULT * FROM TABLE1 WHERE ...
Similar to SQL_BUFFER_RESULT is SQL_BIG_RESULT, which is typically used with GROUP BY or DISTINCT. It tells MySQL that, if necessary, the query results should be placed in a temporary table, or even sorted there.

SELECT SQL_BIG_RESULT FIELD1, COUNT(*) FROM TABLE1 GROUP BY FIELD1

V. MySQL query optimization: using indexes

MySQL uses indexes in several ways:

· As noted above, indexes speed up the search for rows that match a WHERE condition, or for rows that match rows in other tables during a join.

· For queries that use the MIN() or MAX() function, the smallest or largest value in an indexed column can be found quickly without checking every row.

· MySQL uses indexes to perform the sorting and grouping of ORDER BY and GROUP BY clauses quickly.

· Sometimes MySQL can answer a query from the index alone. Suppose you select only an indexed column from a MyISAM table; MySQL then has no need to read other columns from the data file. In this case MySQL reads the value from the index file, which is the same value it would get from the data file. There is no need to read the same value twice, so the data file is not consulted at all.

Index creation rules:
1. The primary key and foreign keys of a table must be indexed;
2. Tables with more than 300 rows should have indexes;
3. For tables that are often joined with other tables, the join columns should be indexed;
4. Columns that often appear in WHERE clauses, especially columns of large tables, should be indexed;
5. Indexes should be built on columns with high selectivity;
6. Indexes should be built on small columns; do not index large text columns or other very long columns;
7. Composite indexes need careful analysis; consider single-column indexes instead where possible:
A. Choose the leading column of the composite index correctly; it is generally the most selective column;
B. How often do the columns of the composite index appear together in a WHERE clause? Are there few or no queries on a single column? If so, build a composite index; otherwise consider single-column indexes;
C. If the columns of the composite index often appear separately in WHERE clauses, split it into several single-column indexes;
D. If a composite index contains more than 3 columns, consider carefully whether some of them can be dropped;
E. If there are both single-column indexes and a composite index on the same columns, the composite index can generally be deleted;
8. Do not create too many indexes on tables with frequent data modifications;
9. Delete useless indexes to avoid a negative impact on execution plans.

These are some common criteria for creating indexes. In a word, indexes must be created with care: the need for each index should be analyzed carefully, and each index should be justified.
Too many indexes, as well as inadequate or incorrect indexes, hurt performance: each index created on a table increases storage overhead, and indexes add processing overhead to inserts, deletes, and updates.
In addition, when single-column indexes already exist, extra composite indexes generally add little value; on the contrary, they reduce the performance of inserts, updates, and deletes, especially on frequently updated tables.

An in-depth analysis of query optimization with EXPLAIN

Here is an example that illustrates how to use EXPLAIN.
First, create a table:

CREATE TABLE IF NOT EXISTS `article` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`author_id` int(10) unsigned NOT NULL,
`category_id` int(10) unsigned NOT NULL,
`views` int(10) unsigned NOT NULL,
`comments` int(10) unsigned NOT NULL,
`title` varbinary(255) NOT NULL,
`content` text NOT NULL,
PRIMARY KEY (`id`)
);

Insert a few rows:

INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');

Requirement:
Find the author_id of the article with the most views among those where category_id is 1 and comments is greater than 1.
First, check the execution plan:

EXPLAIN
SELECT author_id
FROM `article`
WHERE category_id = 1 AND comments > 1
ORDER BY views DESC
LIMIT 1\G

Look at some of the output results:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)

Obviously, type is ALL, which is the worst case. Using filesort also appears in Extra, which is again the worst case. Optimization is a must.
The simplest solution is to add an index, so let's try that. The three columns category_id, comments, and views are all used in the query's conditions, so a composite index on all three is the easiest option.

ALTER TABLE `article` ADD INDEX x (`category_id`, `comments`, `views`);

The result is somewhat better, but still bad:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: range
possible_keys: x
key: x
key_len: 8
ref: NULL
rows: 1
Extra: Using where; Using filesort
1 row in set (0.00 sec)

type becomes range, which is tolerable. However, Using filesort in Extra is still unacceptable. We created an index, so why does it not help with the sort? By the way a B-tree index works, entries are sorted by category_id first, then by comments among entries with the same category_id, then by views among entries with the same comments. Because comments sits in the middle of the composite index and comments > 1 is a range condition, MySQL cannot use the index for the views part that follows. In other words, the index columns after a range-condition column are not used.
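The ordering argument above can be checked with a small model. The sketch below treats an index as nothing more than a sorted list of tuples, which is a simplification of a B-tree but preserves the ordering property that matters here. The sample rows are made up. It compares an index on (category_id, comments, views) with one on (category_id, views), where comments is checked on each row afterwards.

```python
rows = [
    # (category_id, comments, views)
    (1, 1, 50),
    (1, 2, 10),
    (1, 2, 40),
    (1, 3, 5),
    (1, 3, 30),
    (2, 2, 99),
]

def scan(index_entries, predicate):
    """Walk the entries in index order and keep the ones that match."""
    return [entry for entry in index_entries if predicate(entry)]

# Index on (category_id, comments, views): sorted by all three columns in turn.
index_x = sorted(rows)
hits_x = scan(index_x, lambda e: e[0] == 1 and e[1] > 1)
views_x = [e[2] for e in hits_x]

# Index on (category_id, views): sorted by category_id, then views.
# Each entry is stored as (category_id, views, comments); comments is only filtered.
index_y = sorted((c, v, m) for c, m, v in rows)
hits_y = scan(index_y, lambda e: e[0] == 1 and e[2] > 1)
views_y = [e[1] for e in hits_y]

print(views_x)  # [10, 40, 5, 30]  not ordered by views: an extra sort is needed
print(views_y)  # [5, 10, 30, 40]  already ordered by views: no extra sort
```

With the range condition on the middle column, the matching entries span several comments values, and views is ordered only within each of them. With comments left out of the index, all entries for one category_id are ordered by views directly.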
So we need to leave comments out of the index. Delete the old index:

DROP INDEX x ON article;

Then create a new index:

ALTER TABLE `article` ADD INDEX y (`category_id`, `views`);

Then run the query again:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ref
possible_keys: y
key: y
key_len: 4
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)

As you can see, type becomes ref and Using filesort disappears from Extra; the result is ideal.
Now let's look at an example of a multi-table query.
First, define three tables: class, book, and phone.

CREATE TABLE IF NOT EXISTS `class` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `book` (
`bookid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`bookid`)
);
CREATE TABLE IF NOT EXISTS `phone` (
`phoneid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`phoneid`)
) ENGINE = InnoDB;

Then insert a large amount of data into each table. A PHP script to insert the data:

<?php
// The original script used the mysql_* functions, which were removed in PHP 7;
// the mysqli equivalents are used here instead.
$link = mysqli_connect("localhost", "root", "870516", "test");
// Insert 10,000 rows into each table, with a random card value from 1 to 20.
for ($i = 0; $i < 10000; $i++)
{
$j = rand(1, 20);
$sql = "INSERT INTO class (card) VALUES ({$j})";
mysqli_query($link, $sql);
}
for ($i = 0; $i < 10000; $i++)
{
$j = rand(1, 20);
$sql = "INSERT INTO book (card) VALUES ({$j})";
mysqli_query($link, $sql);
}
for ($i = 0; $i < 10000; $i++)
{
$j = rand(1, 20);
$sql = "INSERT INTO phone (card) VALUES ({$j})";
mysqli_query($link, $sql);
}
mysqli_query($link, "COMMIT");
?>

Then look at a left join query:

EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card\G

In the analysis result, the second row has type ALL, and that is what we need to optimize.
Try creating an index:

ALTER TABLE `book` ADD INDEX y (`card`);

You can see that the type of the second row becomes ref and its rows value changes as well (1741*18), so the optimization is quite noticeable. This follows from how a left join works: the LEFT JOIN condition determines how rows are searched from the right table, while every row of the left table is always included. The right table is therefore the key point and must be indexed.
Delete the old index:

DROP INDEX y ON book;

Create a new index:

ALTER TABLE `class` ADD INDEX x (`card`);

Results

Basically no change.
Then look at a right join query:

EXPLAIN SELECT * FROM class RIGHT JOIN book ON class.card = book.card;

The results of the analysis are:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ref
possible_keys: x
key: x
key_len: 4
ref: test.book.card
rows: 1000
Extra:
2 rows in set (0.00 sec)

The optimization is quite noticeable. This is because the RIGHT JOIN condition determines how rows are searched from the left table, while every row of the right table is always included. The left table is therefore the key point and must be indexed.
Delete the old index:

DROP INDEX x ON class;

Create a new index:

ALTER TABLE `book` ADD INDEX y (`card`);

Results

Basically no change.
Finally, let's look at the inner join:

EXPLAIN SELECT * FROM class INNER JOIN book ON class.card = book.card;

Results:

Delete the old index:

DROP INDEX y ON book;

Results

Create a new index:

ALTER TABLE `class` ADD INDEX x (`card`);

Results

To sum up: for INNER JOIN and LEFT JOIN, the right table is the one to optimize; for RIGHT JOIN, it is the left table.
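Why the index belongs on the table that is searched, rather than on the table that is read in full, can be seen in a simplified nested-loop model. This sketch is not how MySQL is implemented internally; it only counts the work done for a left join with and without a lookup structure on the right table. The card values and function names are made up, and a Python dictionary stands in for the index.

```python
from collections import defaultdict

def left_join_scan(left, right):
    """No index on the right table: each left row is compared with every right row."""
    comparisons = 0
    result = []
    for l in left:
        matched = False
        for r in right:
            comparisons += 1
            if l == r:
                result.append((l, r))
                matched = True
        if not matched:
            result.append((l, None))  # a left join keeps unmatched left rows
    return result, comparisons

def left_join_indexed(left, right):
    """Index on the right table's join column: one lookup per left row."""
    index = defaultdict(list)
    for r in right:
        index[r].append(r)
    lookups = 0
    result = []
    for l in left:
        lookups += 1
        matches = index.get(l, [])
        if matches:
            result.extend((l, r) for r in matches)
        else:
            result.append((l, None))
    return result, lookups

class_cards = [1, 2, 3, 4]       # left table: always read in full
book_cards = [2, 2, 4, 5, 6]     # right table: searched once per left row

scan_result, comparisons = left_join_scan(class_cards, book_cards)
index_result, lookups = left_join_indexed(class_cards, book_cards)

print(comparisons, lookups)  # 20 4
```

Both versions return the same rows, but the unindexed version does work proportional to the product of the two table sizes. An index on the left table would not change this, since every left row has to be read anyway.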
Let's look at one more example, a three-table query.
Add new indexes:

ALTER TABLE `phone` ADD INDEX z (`card`);
ALTER TABLE `book` ADD INDEX y (`card`);

EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card;

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ref
possible_keys: y
key: y
key_len: 4
ref: test.class.card
rows: 1000
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: phone
type: ref
possible_keys: z
key: z
key_len: 4
ref: test.book.card
rows: 260
Extra: Using index
3 rows in set (0.00 sec)

The last two rows both have type ref, and the total rows estimate is much better, so the result is good.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
