MySQL Query Optimization

Last Update:2017-05-08 Source: Internet

Author: User



1. Introduction

Excellent access performance is the most important quality of a good web application. The MySQL database is an integral part of many web applications and a major factor in their performance, so improving MySQL performance matters.

MySQL performance work falls into three areas: hardware, network, and software. Hardware and network depend on the company's budget and cost money, so they are not covered here. Software tuning takes many forms; this article covers one of them, improving performance by optimizing MySQL queries.

I recently read several books on query optimization, along with a number of articles on the Internet written by others.

Here is a summary of what I learned about query optimization:

1. General query log

2. Slow query log

3. Binary log

4. Process list

SHOW FULL PROCESSLIST;

...

1. EXPLAIN {PARTITIONS | EXTENDED}

2. SHOW CREATE TABLE tab;

3. SHOW INDEXES FROM tab;

4. SHOW TABLE STATUS LIKE 'tab';

5. SHOW [GLOBAL | SESSION] STATUS LIKE '';

6. SHOW VARIABLES

...
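As a rough illustration of how the logs and commands listed above fit together, the sketch below turns on the slow query log and then inspects one query. The table `tab`, its column `name`, and the one-second threshold are assumptions made for this example, and `SET GLOBAL` needs a sufficiently privileged account.

```sql
-- Check whether the slow query log is on and where it is written.
SHOW VARIABLES LIKE 'slow_query_log%';

-- Turn it on for the running server and treat statements slower than
-- 1 second as slow (the threshold is an arbitrary choice for this sketch).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- See what the server is executing right now.
SHOW FULL PROCESSLIST;

-- Inspect one suspicious query: its plan, then the table's indexes and size.
EXPLAIN SELECT name FROM tab WHERE name = 'abc';
SHOW INDEX FROM tab;
SHOW TABLE STATUS LIKE 'tab';

-- Count how many statements have exceeded long_query_time so far.
SHOW GLOBAL STATUS LIKE 'Slow_queries';
```

A new global value of `long_query_time` only applies to connections opened after the change, so it is worth reconnecting before testing.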

PS: I feel the material above is not very substantial. The genuinely useful part starts here.

1. Avoid full table scans where possible: add indexes to the relevant columns and make queries use them

2. Delete unused or duplicate indexes

3. Rewrite queries through equivalent conversion (predicates, subqueries, join queries)

4. Delete duplicated or unnecessary statements to slim down the SQL

5. Consolidate statements that are executed repeatedly

6. Cache query results

Advantages of indexes:

1. They maintain data integrity

2. They improve query performance

3. They speed up table joins (JOIN)

4. They sort query results. Without an index, MySQL falls back to its internal filesort algorithm, which is slow

5. They simplify aggregate operations

Disadvantages of indexes:

1. An index takes up storage space

2. INSERT, UPDATE, and DELETE become slower, because the index must be updated whenever the data changes

3. With many indexes, the optimizer spends time choosing the best one

When to use an index:

1. Use one when the data volume is large

2. Do not use one when the column's values are highly repetitive

3. When a query retrieves more than about 20% of the rows, MySQL performs a full table scan and does not use the index
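One way to judge whether a column is too repetitive to be worth indexing is to measure its selectivity first. The following is a minimal sketch; the table `tab`, the column `name`, and the index name `idx_tab_name` are hypothetical.

```sql
-- Selectivity: distinct values divided by total rows.
-- Values close to 1 suit an index; values close to 0 mean heavy repetition.
SELECT COUNT(DISTINCT name) / COUNT(*) AS selectivity FROM tab;

-- If the column is selective enough, index it.
CREATE INDEX idx_tab_name ON tab (name);

-- The Cardinality column shows the optimizer's estimate of distinct values.
SHOW INDEX FROM tab;
```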

Reference notes:

In MySQL, both InnoDB and MyISAM use B-tree indexes.

The index kinds stored as B-trees are PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT.

MySQL can use a B-tree index for range comparisons, but the optimizer may choose a full table scan when it estimates that a condition matches a large share of the rows. In my tests, the index on a column was not used when the column was filtered with the following operators:

>, <, >=, <=, BETWEEN, !=, <>, LIKE '%**'

"Introducing the covering index"

Let me explain it the way I understand it. A covering index is not a concrete index type like a primary key index or a unique index; it is a name for a particular way an index gets used. Put another way: every column the query needs is an indexed column, so the index "covers" the query. It gets around the limitation above: the query can use the operators listed and still be answered through the index.

Because the queried columns are all indexed columns, MySQL does not need to read the table rows; it reads the column values straight from the index. For example, when you look something up in a book and the information you need already appears in the table of contents, you do not have to turn to the page; you get the answer directly from the table of contents.

How is a covering index triggered? What are the specific scenarios?

Every column in the SELECT list is an indexed column.

Composite indexes allow further scenarios. For example, with a three-column composite index, a query can use the index as a covering index when the columns it references in SELECT, WHERE, GROUP BY, and ORDER BY all belong to that index.

How to check:

In the EXPLAIN output, "Using index" in the Extra column indicates that the statement uses a covering index.
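To see the difference in practice, compare the plans of a query that requests only indexed columns with one that requests every column. This is a sketch under assumptions: the table `person`, its columns, and the index `idx_age` are hypothetical, and the plan the optimizer actually picks depends on the data and the MySQL version.

```sql
-- Hypothetical InnoDB table: age has a secondary index, name does not.
CREATE TABLE person (
    id   INT PRIMARY KEY,
    name VARCHAR(50),
    age  INT,
    INDEX idx_age (age)
) ENGINE = InnoDB;

-- Only indexed columns are requested. An InnoDB secondary index also
-- stores the primary key, so idx_age covers both id and age and the
-- Extra column should contain "Using index".
EXPLAIN SELECT id, age FROM person WHERE age >= 30;

-- name is not part of idx_age, so matching rows must be read from the
-- table itself and the index no longer covers the query.
EXPLAIN SELECT * FROM person WHERE age >= 30;
```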

Conclusion:

Avoid SELECT * FROM in queries. List only the columns you need and add matching indexes to improve query performance.

Test results for the operators above:

1. With SELECT * FROM, a WHERE condition on the primary key uses the index for every operator except LIKE; with an ordinary index, the access type is ALL (a full table scan).

2. With SELECT column_a FROM tab WHERE column_a <one of the operators above>, the query can still use the index, because a covering index applies.

Other index optimization methods:

1. Use indexed columns as join conditions

2. Use composite indexes

3. For AND/OR conditions that trigger an index merge, combine the columns involved into one composite index

4. Add indexes to the columns used in WHERE and GROUP BY

A non-correlated subquery in the FROM clause can be pulled up into the parent query. In a multi-table join query, weigh the join cost before choosing this rewrite.

The query optimizer typically handles a subquery in a nested fashion, executing it once for each row of the parent query, so the subquery runs many times. This kind of execution is inefficient.

Advantages of converting a subquery into a join query:

1. The subquery does not have to be executed many times

2. The optimizer can choose among different join methods and join orders based on the available information

3. The subquery's join and filter conditions become filter conditions of the parent query, which improves efficiency

Optimization:

Subquery merging: when there are several subqueries, merge them wherever possible.

Subquery expansion: pull the subquery up so that the statement becomes a multi-table query (the rewrite must always remain equivalent).

Note:

Subquery expansion only works for simple subqueries. If the subquery contains aggregate functions, GROUP BY, or DISTINCT, it cannot be pulled up.

SELECT * FROM t1, (SELECT * FROM tab WHERE id > 10) AS t2 WHERE t1.age > 10 AND t2.age < 25;

SELECT * FROM t1, tab AS t2 WHERE t1.age > 10 AND t2.age < 25 AND t2.id > 10;

Specific steps:

1. Merge the two FROM clauses and adjust the table references accordingly

2. Merge the two WHERE clauses, joining their conditions with AND

3. Change the predicate accordingly (IN becomes =)
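The three steps can be sketched on an IN subquery. The tables `orders` and `customer` and their columns are hypothetical, and the rewrite is only equivalent under the stated assumption that `customer.id` is unique (for example, a primary key); otherwise the join could return duplicate rows.

```sql
-- Original: IN with a non-correlated subquery.
SELECT o.*
FROM orders AS o
WHERE o.customer_id IN (SELECT c.id FROM customer AS c WHERE c.city = 'Beijing');

-- Rewritten: the FROM clauses are merged, the WHERE conditions are
-- joined with AND, and IN is replaced by =.
-- Assumes customer.id is unique, so each order matches at most one customer.
SELECT o.*
FROM orders AS o, customer AS c
WHERE o.customer_id = c.id
  AND c.city = 'Beijing';
```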

1. Rewrite BETWEEN ... AND as >= and <=. Measured on 100,000 rows: 1.45s before the rewrite, 0.06s after

2. Convert IN into multiple OR conditions. When the column is indexed, both forms can use the index, and OR is somewhat more efficient

3. Rewrite name LIKE 'abc%' as name >= 'abc' AND name < 'abd';

Note: in a test on millions of rows, with no index on name the LIKE form was faster than the range form. After adding an index on the column, the range form was slightly faster, but the difference was small, because both forms then use the index.
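Written out as statement pairs, the three rewrites look like the sketch below. The table `tab` and its columns `id`, `age`, and `name` are hypothetical. The LIKE rewrite in particular depends on the column's collation, so it is worth confirming that both forms return the same rows on real data.

```sql
-- BETWEEN is inclusive on both ends, so it maps to >= and <=.
SELECT * FROM tab WHERE age BETWEEN 20 AND 30;
SELECT * FROM tab WHERE age >= 20 AND age <= 30;

-- IN over a short list, expanded into OR conditions.
SELECT * FROM tab WHERE id IN (1, 2, 3);
SELECT * FROM tab WHERE id = 1 OR id = 2 OR id = 3;

-- A prefix LIKE as a half-open range: the upper bound 'abd' is the
-- prefix 'abc' with its last character incremented.
SELECT * FROM tab WHERE name LIKE 'abc%';
SELECT * FROM tab WHERE name >= 'abc' AND name < 'abd';
```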

...

1. Combine the conditions in WHERE, HAVING (when there is no GROUP BY or aggregate function), and JOIN ... ON wherever possible

2. Remove unnecessary parentheses to reduce the depth of the AND/OR syntax tree and save CPU

3. Constant propagation: a = b AND b = 2 is converted to a = 2 AND b = 2. Try to avoid comparisons against variables, such as a = b or a comparison with an @ user variable

4. Eliminate useless SQL conditions

5. Avoid expression evaluation on the right-hand side of the equals sign in WHERE, and do not apply expressions or functions to the columns in WHERE

6. Identity and inequality transformations. Example: in a test on millions of rows, rewriting a > b AND b > 10 as a > b AND a > 10 AND b > 10 improved performance
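A few of these simplifications are easier to see as before-and-after pairs. The sketch assumes a hypothetical table `tab` with integer columns `a`, `b`, and `age` and a DATE or DATETIME column `created`.

```sql
-- Constant propagation: a = b AND b = 2 becomes a = 2 AND b = 2.
SELECT * FROM tab WHERE a = b AND b = 2;
SELECT * FROM tab WHERE a = 2 AND b = 2;

-- Arithmetic moved off the column, so an index on age stays usable.
SELECT * FROM tab WHERE age + 1 > 20;
SELECT * FROM tab WHERE age > 19;

-- A function on the column replaced by a range on the bare column.
SELECT * FROM tab WHERE YEAR(created) = 2017;
SELECT * FROM tab WHERE created >= '2017-01-01' AND created < '2018-01-01';
```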

Converting outer joins to inner joins

Advantages:

1. The optimizer needs more steps and more time to process an outer join than an inner join

2. Once the outer join is eliminated, the optimizer has more join orders to choose from and can pick the best one

3. The table with the most restrictive filter condition can be used as the outer table (the outermost loop of the nested loops, which matters most in the join order).

This reduces unnecessary I/O overhead and speeds up execution.

The difference between ON a.id = b.id and WHERE a.id = b.id: ON is applied while the tables are being joined, whereas WHERE filters the data after the join.

Note: the precondition is that the WHERE condition rejects NULLs (that is, it filters out the rows whose inner-table columns are NULL, so the query is semantically an inner join).
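The NULL-rejection precondition can be illustrated with two hypothetical tables `a` and `b` that share an `id` column, where `b` also has a `score` column. The third statement shows why the placement of the condition (ON versus WHERE) matters.

```sql
-- The WHERE condition on b.score is never true for the NULL-filled rows
-- that the LEFT JOIN adds, so those rows are discarded anyway.
SELECT a.id, b.score
FROM a LEFT JOIN b ON a.id = b.id
WHERE b.score > 60;

-- Semantically the same query, written as an inner join.
SELECT a.id, b.score
FROM a INNER JOIN b ON a.id = b.id
WHERE b.score > 60;

-- Not equivalent: inside ON, the condition only decides which rows of b
-- match. Every row of a is still returned, with NULL for b.score when
-- nothing matches.
SELECT a.id, b.score
FROM a LEFT JOIN b ON a.id = b.id AND b.score > 60;
```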

Optimization principle:

Slim down the query through join elimination and equivalent conversion, removing joins to redundant tables.

For example, when a primary key or unique key is the join condition and the intermediate table's columns are used only in equality conditions, the join to the intermediate table can be removed.

1. The following cause the index to be abandoned in favor of a full table scan

1.1. Using the != or <> operator in the WHERE clause. Note: this works on the primary key but not on non-primary-key columns

1.2. Avoid using OR

In my tests, OR does not always prevent index use. In most cases the index is not used, but in a few cases it is, so each case needs its own analysis.

A similar optimization:

SELECT * FROM tab WHERE name = 'AA' OR name = 'BB';

SELECT * FROM tab WHERE name = 'AA'

UNION ALL

SELECT * FROM tab WHERE name = 'BB';

Measured:

1. On 100,000 rows without any index, the OR query was about twice as fast as the UNION ALL query.

2. On 300,000 rows with the filtered columns indexed separately, the UNION ALL query was slightly faster than the OR query.

1.3. Avoid using NOT IN

It generally does not use the index, although it can on the primary key column

1.4. Avoid NULL checks in WHERE

1.5. A LIKE pattern should not start with a percent sign, as in LIKE '%.com'

Solutions:

1. If the leading % is unavoidable and the values are short, such as URLs, store the values reversed in the database and query the reversed form: LIKE REVERSE('%.com');

2. Use a covering index
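The reversed-storage approach could look like the following sketch. The table `site`, its column `url`, the added column `url_rev`, and the index name are all hypothetical, and the application (or a trigger) would have to keep `url_rev` in sync for new and updated rows.

```sql
-- Keep a reversed copy of the column and index it.
ALTER TABLE site ADD COLUMN url_rev VARCHAR(100);
UPDATE site SET url_rev = REVERSE(url);
CREATE INDEX idx_site_url_rev ON site (url_rev);

-- The suffix search on url starts with a wildcard ...
SELECT url FROM site WHERE url LIKE '%.com';

-- ... and becomes a prefix search on the reversed copy:
-- REVERSE('%.com') evaluates to 'moc.%'.
SELECT url FROM site WHERE url_rev LIKE REVERSE('%.com');
```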

1.6. When a column of a composite index is used as a condition, the condition should include the leftmost prefix column of that index
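The leftmost-prefix rule can be checked with EXPLAIN. The sketch assumes a hypothetical table `tab` with columns `a`, `b`, and `c`, plus other columns outside the index; newer MySQL versions have extra strategies that may still touch the index in some of the later cases, so the real plans are worth checking.

```sql
CREATE INDEX idx_abc ON tab (a, b, c);

-- These conditions start from the leftmost column, so idx_abc can be
-- used to look up the matching rows.
EXPLAIN SELECT * FROM tab WHERE a = 1;
EXPLAIN SELECT * FROM tab WHERE a = 1 AND b = 2;
EXPLAIN SELECT * FROM tab WHERE a = 1 AND b = 2 AND c = 3;

-- These skip column a, so idx_abc cannot be used for a direct lookup.
EXPLAIN SELECT * FROM tab WHERE b = 2;
EXPLAIN SELECT * FROM tab WHERE b = 2 AND c = 3;
```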

2. Replace IN with EXISTS

SELECT num FROM a WHERE num IN (SELECT num FROM b);

SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.num = a.num);

On 1 million rows, with 59,417 rows matching, the two queries took 6.65s and 4.18s respectively. No other optimization was made; IN was simply replaced with EXISTS.

3. If a column is defined as a string and the query value is written without quotation marks, the index is not used and a full table scan results.
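A small sketch of the quoting problem follows; the table `contact` and its columns are hypothetical. When a string column is compared with a numeric literal, MySQL converts the stored strings to numbers for the comparison, which prevents an index lookup on the column.

```sql
-- phone is a string column with an index.
CREATE TABLE contact (
    id    INT PRIMARY KEY,
    phone VARCHAR(20),
    INDEX idx_phone (phone)
);

-- Numeric literal: the stored strings are converted to numbers for the
-- comparison, so idx_phone cannot be used for a lookup.
EXPLAIN SELECT * FROM contact WHERE phone = 13800000000;

-- String literal: compared as strings, so idx_phone can be used.
EXPLAIN SELECT * FROM contact WHERE phone = '13800000000';
```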

"The following items are excerpted from the blog post at Http://www.cnblogs.com/lingiu/p/3414134.html; I have not tested them myself"

4. Prefer table variables to temporary tables where possible

5. Avoid creating and dropping temporary tables frequently, to reduce the load on system table resources

6. If temporary tables are used, explicitly delete all of them at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE. This avoids holding locks on the system tables for a long time

7. Avoid cursors where possible, because they are inefficient. If a cursor operates on more than 10,000 rows, consider rewriting the logic

8. For very large data volumes, consider whether the underlying requirement is reasonable

9. Avoid large transactions where possible, to improve the system's concurrency

...

After several days of studying the material and writing code, I learned that MySQL query optimization cannot be achieved by mechanically applying a formula. Experiment is the only real test. My conclusion after these days of testing: the techniques above give the general direction, but a universal optimization formula does not exist, because every SQL statement differs in its result set, its keys, and the columns of the tables involved. Experts who can see how to optimize a statement just by reading it have studied SQL query optimization carefully and have years of hands-on experience. I am only a beginner.

My advice to anyone learning SQL query optimization: do not just read. Write more code and run more tests, across different column setups and different data volumes.

This is my own summary, and it probably has shortcomings. I am a beginner and not on a DBA track, so if you find anything missing or wrong, please point it out.
