MySQL Query optimization tips

Last Update:2016-04-29 Source: Internet

Author: User

Tags mysql query mysql query optimization

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

MySQL Query statement optimization

Index optimization, query optimization, query caching, Server Setup optimization, operating system and hardware optimization, application-level optimization (Web server, caching) and more. The record of the optimization techniques here is more suitable for developers, are collected from the network and their own collation, mainly query statements above the optimization, other aspects of the optimization techniques are not recorded here.
Cost Metrics for queries:
Execution time
Number of rows checked
Number of rows returned
Several guidelines for indexing:
(1), reasonable index can speed up the efficiency of data reading, unreasonable indexing will slow down the response speed of the database.
(2), the more indexes, the slower the data is updated.
(3), try to use MyISAM as the engine when using the index (because MySQL btree storage index), rather than InnoDB. But MyISAM does not support transcation.
(4), when your program and database structure/SQL statements have been optimized to the extent that can not be optimized, and the program bottleneck does not solve the problem, that is, should consider the use of a distributed cache system such as memcached.
(5), habit and force yourself to use explain to analyze the performance of your SQL statement.
one, the optimization of the Count
For example: Calculate a city with an ID greater than 5
(1). Select COUNT (*) from world.city where ID > 5;
(2). Select (SELECT COUNT () from world.city) –count () from world.city where ID <= 5;
A statement requires more rows to be scanned than the B statement when the number of rows exceeds 11, and the B statement scans 6 rows, in which case the B statement is more efficient than the a statement. Direct SELECT COUNT (*) from world.city is faster when there is no where statement, because MySQL always knows the number of rows in the table.
second, avoid the use of incompatible data types
For example, float and int, char and varchar, binary, and varbinary are incompatible. Incompatible data types may make the optimizer unable to perform some optimizations that could otherwise have been performed.
In the program, to ensure that the implementation of the function on the basis of minimizing access to the database, through the search parameters, minimize the number of access to the table, minimize the result set, thereby reducing the network burden, can separate the operation as far as possible separate processing, improve the response speed of each time; When using SQL in the Data window, Try to put the index used in the first column of the selection; the structure of the algorithm is as simple as possible; when querying, do not use wildcards such as the SELECT * from T1 statement, use a few columns to select a few columns such as: Select Col1,col2 from T1 , as far as possible, limit the number of result sets as possible: SELECT TOP col1,col2,col3 from T1, because in some cases the user does not need that much data. Rather than using database cursors in your app, cursors are a useful tool, but they require more overhead than regular, set-oriented SQL statements, and fetch data in a specific order.
third, the operation on the index field will invalidate the index
Try to avoid a function or expression on a field in the Where clause, which causes the engine to discard the full table scan using the index. Such as:
SELECT * from T1 where f1/2=100 should read: SELECT * from T1 where f1=100*2
Iv. Avoid using! = or ">, is null or is not NULL,
Because this makes the system unusable with indexes, it can only search the data in the table directly. For example: The Select ID from the employee WHERE ID! = "B%" Optimizer will not be able to determine the number of rows in the fatal row by index, so you need to search all rows of the table. In the in statement can use the EXISTS statement instead of the exists.
v. Use numeric fields as much as possible
Some developers and database managers like to put fields that contain numeric information
Designed as a character type, which reduces the performance of queries and connections and increases storage overhead. This is because the engine is processing the query and connecting back to each character in a string by comparison, and for a digital type it only needs to be compared once.
vi. Rational Use of exists,not exists clauses
As shown below:
(1). SELECT SUM (T1. C1) from T1 where (SELECT COUNT (*) from T2 where t2.c2=t1.c2>0)
(2). SELECT SUM (T1. C1) from T1where EXISTS (SELECT * from T2 WHERE T2. C2=t1. C2)
Both produce the same result, but the latter is obviously more efficient than the former. Because the latter does not produce a large number of locked table scans or index scans. If you want to verify that there is a record in the table, do not use COUNT (*) as inefficient and waste server resources. Can be replaced with exists. Such as:
IF (SELECT COUNT () from table_name WHERE column_name = ' xxx ') can be written as: IF EXISTS (SELECTFrom table_name WHERE column_name = ' xxx ')
If you can use between, do not use in
Eight, can use the distinct without GROUP by
Nine, try not to use the SELECT INTO statement. The SELECT into statement causes the table to lock and prevent other users from accessing the table
10. Force the query optimizer to use an index if necessary
SELECT * from T1 WHERE nextprocess = 1 and ProcessID in (8,32,45) changed to:
SELECT * from T1 (INDEX = ix_processid) WHERE nextprocess = 1 and ProcessID in (8,32,45)
The query optimizer will forcibly use the index Ix_processid to execute the query.
11. Eliminate sequential access to large table row data
Although there are indexes on all the check columns, some forms of where clauses force the optimizer to use sequential access. Such as:
SELECT * FROM Orders WHERE (customer_num=104 and order_num>1001) OR order_num=1008
The workaround can use the set to avoid sequential access:
SELECT * FROM orders where customer_num=104 and order_num>1001 UNION SELECT * from orders where order_num=1008
This enables the query to be processed using the index path. "The jacking data result set is many, but the result set is not large after the query condition is limited, the later statement is fast"
12, try to avoid in the indexed character data, the use of non-beginning letter search. This also makes the engine unusable with indexes
See the following example:
SELECT * from T1 WHERE NAME like '%l% '
SELECT * from T1 WHERE substing (name,2,1) = ' L '
SELECT * from T1 WHERE NAME like ' l% '
Even though the name field is indexed, the first two queries are still unable to take advantage of the indexing to speed up the operation, and the engine has to perform the task by one-by-one operations on all tables. And the third query can use the index to speed up the operation, do not habitually use '%l% ' this way (will result in full table scan), if you can use ' l% ' relatively better;
13, although the update, DELETE statement is basically fixed, but also to the UPDATE statement to the point of recommendation
(1). Try not to modify the primary key field.
(2). When modifying a varchar field, try to replace it with values of the same length content.
(3). Minimize the update operation for the table that contains the update trigger.
(4). Avoid the columns that update will replicate to other databases.
(5). Avoid the update of columns with many indexes.
(6). Avoid update columns in the WHERE clause condition.
14, you can use union all do not use Union
UNION all does not execute the SELECT DISTINCT function, which reduces a lot of unnecessary resources
Using union across multiple different databases is an interesting optimization method, and union returns data from two unrelated tables, which means that no duplicate rows will occur and the data must be sorted, and we know that sorting is very resource-intensive, especially for large tables.
Union ALL can be much faster if you already know that your data does not include duplicate rows, or if you don't care if duplicate rows appear, in both cases it is more appropriate to use UNION all. In addition, you can use some methods in your application logic to avoid duplicate rows, so that the Union all and union return the same results, but the union all is not sorted.
15. field Data type optimization
(1). Avoid the use of NULL type: null for most databases require special handling, MySQL is no exception, it requires more code, more checks and special indexing logic, some developers do not realize that when creating a table null is the default value, but most of the time should use not NULL, or use a special value, such as 0,-1, as the default value.
(2). Use smaller fields whenever possible, MySQL reads data from disk into memory, then reads it with CPU cycles and disk I/O, which means smaller data types take up less space, read from disk or package to memory more efficiently, but don't be too persistent in reducing data types. There is no room for any future changes to the application. Modifying the table will require refactoring, which can indirectly lead to changes in the code, which is a headache, so you need to find a balance point.
(3). Priority use of fixed length type
17, about the large data volume limit distribution optimization (when the offset is particularly large, limit efficiency will be very low)
Attach a simple technique to improve the efficiency of limit, in the overlay index (the overlay index in layman's words is to use only to read the index when the select to get the data, do not have to make two select related tables) on the offset, rather than the entire row of data offset. You can join the data that is extracted from the overlay index and the entire row of data, and then get the columns you need to be more efficient and look at the following query:
Mysql> Select film_id, description from Sakila.film ORDER by title limit 50, 5;
If the table is very large, this query is best written as follows:
Mysql> Select film.film_id, film.description from Sakila.film
INNER JOIN (select film_id from Sakila.film ORDER by title Liimit 50,5) as film Usinig (FILM_ID);
18. If you insert multiple data into the same table at one time
such as the following statement:
Insert into person (name,age) VALUES (' Xboy ', 14);
Insert into person (name,age) VALUES (' Xgirl ', 15);
Insert into person (name,age) VALUES (' Nia ', 19);
It will be more efficient to put it into a statement.
Insert into person (name,age) VALUES (' Xboy ', +), (' Xgirl ', +), (' Nia ', 19);
19, do not place the index on the selected field, this is meaningless. The index should be placed reasonably on the statement of the conditional selection, such as Where,order by
SELECT id,title,content,cat_id from article WHERE cat_id = 1;
The above statement, you put an index on the id/title/content is meaningless, there is no optimization for this statement. But if you place an index on the foreign key cat_id, it will work quite a lot.
20. mysql optimization for ORDER by statement
(1). Index optimization for the ORDER by + limit combination. If an SQL statement is like:
SELECT [Column1],[column2],.... From [TABLE] ORDER by [sort] LIMIT [offset],[limit];
This SQL statement optimization is relatively simple and can be indexed on the [Sort] field.
(2). WHERE + ORDER by + limit combination index optimization, shaped like:
SELECT [Column1],[column2],.... From [TABLE] WHERE [COLUMNX] = [VALUE] ORDER by [sort] LIMIT [offset],[limit];
This statement, if you still use the first example of the method of indexing, although you can use the index, but not high efficiency. A more efficient approach is to create a federated index (Columnx,sort)
(3). WHERE + in + the index optimization of the ORDER by + limit combination, shaped like:
SELECT [Column1],[column2],.... From [TABLE] WHERE [ColumnX] In ([Value1],[value2],...) ORDER by [sort] LIMIT [offset],[limit];
This statement if you use the method of indexing in the second example, you will not get the desired effect (only the using Index,where on [sort] is the using where;using filesort), the reason is that there are multiple corresponding values for COLUMNX.
At present, the elder brother also found a better way to wait for expert advice.
(4). Where+order by multiple fields +limit, such as:
SELECT * from [table] WHERE uid=1 ORDER x, y LIMIT 0, 10;
For this statement, you may be adding an index such as this: (X,y,uid). But actually the better effect is (uid,x,y). This is caused by the mechanism of MySQL processing sorting.

MySQL Query optimization tips

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More