MySQL Query Statement Optimization Techniques


MySQL optimization covers index optimization, query optimization, the query cache, server configuration, operating system and hardware tuning, application-level optimization (web servers, caching), and more. This article records the optimization techniques most relevant to developers, collected from the web and organized by the author; it focuses on query optimization and does not cover the other levels.

Cost metrics for a query:

Execution time

Number of rows examined

Number of rows returned

Several guidelines for indexing:

(1). Well-chosen indexes speed up data reads, but they slow down data writes, because every index has to be maintained.

(2). The more indexes a table has, the slower updates to that table become.

(3). Where possible, prefer MyISAM as the storage engine and make full use of its indexes (MySQL stores indexes as B-trees) rather than InnoDB; note, however, that MyISAM does not support transactions.

(4). When your application code, database schema, and SQL statements have been optimized as far as they can be and the bottleneck still cannot be resolved, it is time to consider a distributed caching system such as memcached.

(5). Get into the habit of using EXPLAIN to analyze the performance of your SQL statements, as in the sketch below.
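
For example, the following sketch (using a hypothetical orders table) shows how EXPLAIN reveals whether a query uses an index; the key, rows, and type columns show the chosen index, the estimated number of rows examined, and the access type.

EXPLAIN SELECT * FROM orders WHERE customer_num = 104;
-- If customer_num is indexed, the plan shows type=ref and key=<index name>;
-- without an index it shows type=ALL, i.e. a full table scan.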

1. Optimizing COUNT

For example, to count the cities with an ID greater than 5:

SELECT COUNT(*) FROM world.city WHERE ID > 5;
SELECT (SELECT COUNT(*) FROM world.city) - COUNT(*) FROM world.city WHERE ID <= 5;

Statement A has to examine every row with ID > 5, while statement B subtracts the count of rows with ID <= 5 from the total row count. Once the table holds more than about 11 rows, A examines more rows than B, which examines only about 6, so B is the more efficient form. When there is no WHERE clause at all, a plain SELECT COUNT(*) FROM world.city is faster still, because MySQL (with MyISAM tables) always knows the total number of rows in the table.

2. Avoid incompatible data types

For example, FLOAT and INT, CHAR and VARCHAR, and BINARY and VARBINARY are mutually incompatible types. Comparing incompatible data types can prevent the optimizer from performing optimizations it could otherwise apply, such as using an index.
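
A common case (a minimal sketch, using a hypothetical users table with a VARCHAR phone column) is comparing a string column to a numeric literal, which forces a type conversion on every row and can keep the optimizer from using the index on that column:

-- phone is VARCHAR; comparing it to a number forces a conversion, so the index may be skipped.
SELECT * FROM users WHERE phone = 13800000000;
-- Comparing against a string literal of the matching type lets the index be used.
SELECT * FROM users WHERE phone = '13800000000';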

In application code, on top of implementing the required functionality, try to:

(1). Minimize the number of round trips to the database, use search parameters to reduce the number of rows accessed in each table, and keep result sets as small as possible to lighten the network load.

(2). Split operations apart where possible, so that each individual request responds quickly.

(3). When using SQL in a data window, try to put the indexed column first in the selection criteria and keep the algorithm structure as simple as possible.

(4). Do not query with wildcards such as SELECT * FROM T1; select only the columns you need, e.g. SELECT col1, col2 FROM T1, and limit the number of rows in the result set (for example with a LIMIT clause), because in many cases the user does not need that much data.

(5). Avoid database cursors in the application where possible; cursors are a useful tool, but they carry more overhead than regular, set-oriented SQL statements and force rows to be fetched in a particular order.

3. Operations on an indexed field invalidate the index

Try to avoid applying functions or expressions to a column in the WHERE clause; doing so causes the engine to abandon the index and fall back to a full table scan. For example:

SELECT * FROM T1 WHERE f1/2 = 100
should be changed to:
SELECT * FROM T1 WHERE f1 = 100*2
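
The same applies to functions wrapped around an indexed column. A minimal sketch, assuming a hypothetical indexed created_at column (the date values are illustrative):

-- The function wraps the indexed column, so the index cannot be used:
SELECT * FROM T1 WHERE DATE(created_at) = '2024-01-01';
-- Rewriting it as a range on the bare column lets the index be used:
SELECT * FROM T1 WHERE created_at >= '2024-01-01' AND created_at < '2024-01-02';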

4. Avoid operators such as !=, <>, IS NULL, IS NOT NULL, IN, and NOT IN

These operators often prevent the system from using an index, leaving it to scan the table directly. For example, in SELECT id FROM employee WHERE id != 'b%' the optimizer cannot use the index to determine the number of matching rows, so it has to examine every row of the table. Where possible, an IN subquery can be replaced by an EXISTS subquery, as in the sketch below.
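
A minimal sketch of that IN-to-EXISTS rewrite, assuming hypothetical tables t1 and t2 that share a column c2:

-- Using IN with a subquery:
SELECT * FROM t1 WHERE c2 IN (SELECT c2 FROM t2);
-- The equivalent EXISTS form, which can often be evaluated more cheaply:
SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.c2 = t1.c2);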

5. Use numeric fields as much as possible

Some developers and database administrators like to design columns that contain purely numeric information as character types. This hurts the performance of queries and joins and increases storage overhead, because when processing queries and joins the engine compares strings character by character, whereas a numeric type needs only a single comparison.
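
A minimal sketch of the idea, using a hypothetical orders_example table: store numeric identifiers as integers rather than strings.

-- Storing numeric codes as VARCHAR forces character-by-character comparisons;
-- integer columns keep comparisons and joins cheap.
CREATE TABLE orders_example (
    order_num    INT UNSIGNED NOT NULL,
    customer_num INT UNSIGNED NOT NULL,
    PRIMARY KEY (order_num)
);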

6. Use the EXISTS and NOT EXISTS clauses rationally

As shown below:

SELECT SUM(t1.c1) FROM t1 WHERE (SELECT COUNT(*) FROM t2 WHERE t2.c2 = t1.c2) > 0;
SELECT SUM(t1.c1) FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t2.c2 = t1.c2);

The two produce the same result, but the latter is clearly more efficient, because it does not produce large numbers of locked table scans or index scans. Likewise, to check whether a record exists in a table, do not use COUNT(*); it is inefficient and wastes server resources. Use EXISTS instead. For example:

IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx')
can be written as:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')

7. If you can use BETWEEN, do not use IN
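
A minimal sketch, assuming the IN list covers a contiguous range of values in a hypothetical indexed id column:

-- An IN list over a contiguous range:
SELECT * FROM t1 WHERE id IN (1, 2, 3, 4, 5);
-- BETWEEN expresses the same contiguous range as a single range condition,
-- which the optimizer can satisfy with one index range scan:
SELECT * FROM t1 WHERE id BETWEEN 1 AND 5;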

8. If DISTINCT is enough, do not use GROUP BY
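
A minimal sketch, using a hypothetical orders table: when you only need the distinct values of a column and no aggregates, DISTINCT states that directly.

-- GROUP BY used only to deduplicate:
SELECT customer_num FROM orders GROUP BY customer_num;
-- DISTINCT expresses the same thing without implying aggregation:
SELECT DISTINCT customer_num FROM orders;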

9. Try not to use the SELECT INTO statement

A SELECT INTO statement causes the table to be locked, preventing other users from accessing it.

10. Force the query optimizer to use an index if necessary

SELECT * FROM T1 WHERE nextprocess = 1 AND ProcessID IN (8,32,45)
can be changed to:
SELECT * FROM T1 FORCE INDEX (ix_processid) WHERE nextprocess = 1 AND ProcessID IN (8,32,45)

In MySQL, the FORCE INDEX hint makes the query execute using the index ix_processid.

11. Eliminate sequential access to large-table row data

Even when all of the columns being tested are indexed, some forms of WHERE clause force the optimizer to use sequential (full-table) access. For example:

SELECT * FROM orders WHERE (customer_num=104 AND order_num>1001) OR order_num=1008

As a workaround, you can use a UNION of the two conditions to avoid sequential access:

SELECT * FROM orders WHERE customer_num=104 AND order_num>1001
UNION
SELECT * FROM orders WHERE order_num=1008

This allows each branch of the query to be processed along an index path. With loose conditions it can still produce a large result set, but when the query conditions are selective and the result set is small, this form is fast.

12. On indexed character columns, avoid searches that do not anchor to the start of the string

This, too, prevents the engine from using the index.

See the following example:

SELECT * FROM T1 WHERE name LIKE '%l%';
SELECT * FROM T1 WHERE SUBSTRING(name,2,1) = 'l';
SELECT * FROM T1 WHERE name LIKE 'l%';

Even if the name column is indexed, the first two queries cannot use the index to speed things up; the engine has to process every row in the table. The third query can use the index. Do not habitually write patterns like '%l%' (they force a full table scan); if 'l%' expresses what you need, it is much better.

13. UPDATE and DELETE statements are largely fixed in form, but here are some suggestions for UPDATE statements

(1). Try not to modify primary key columns.

(2). When modifying VARCHAR columns, try to replace the value with one of the same length.

(3). Minimize UPDATE operations on tables that have UPDATE triggers.

(4). Avoid updating columns that are replicated to other databases.

(5). Avoid updating columns that appear in many indexes.

(6). Avoid updating columns that are used in the WHERE clause condition.

14. If you can use UNION ALL, do not use UNION

UNION ALL does not perform the implicit SELECT DISTINCT step, which saves a great deal of unnecessary work.

UNION is a handy way to combine data across tables or even databases: it merges the rows of two otherwise unrelated result sets and guarantees that no duplicate rows remain. That guarantee means the data must be deduplicated, typically by sorting or building a temporary table, and sorting is very resource-intensive, especially on large tables.

UNION ALL can be much faster. If you already know your data contains no duplicate rows, or you do not care whether duplicates appear, UNION ALL is the better choice. You can also prevent duplicates in application logic, so that UNION ALL and UNION return the same result while UNION ALL skips the deduplication step.
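
A minimal sketch, reusing the orders table from section 11; archived_orders is a hypothetical second table with the same structure:

-- UNION removes duplicates, which requires an implicit deduplication step:
SELECT order_num FROM orders WHERE customer_num = 104
UNION
SELECT order_num FROM archived_orders WHERE customer_num = 104;
-- UNION ALL simply concatenates the two result sets, with no deduplication:
SELECT order_num FROM orders WHERE customer_num = 104
UNION ALL
SELECT order_num FROM archived_orders WHERE customer_num = 104;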

15. Optimizing column data types

(1). Avoid nullable columns: NULL requires special handling in most databases, and MySQL is no exception; it needs more code, more checks, and special indexing logic. Some developers are simply unaware that columns are nullable by default when a table is created; most of the time you should declare them NOT NULL, or use a sentinel value such as 0 or -1 as the default.

(2). Use the smallest column types that will do the job. MySQL reads data from disk into memory before processing it, consuming CPU cycles and disk I/O, so smaller data types take less space and are cheaper to read from disk and hold in memory. But do not be too aggressive about shrinking types either, or there will be no room for future changes in the application: altering the table later means restructuring it, which can indirectly force code changes, which is a headache, so find a balance.

(3). Prefer fixed-length types where practical; a minimal sketch of these three points follows the list.
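
A minimal sketch of these guidelines, using the person table that appears later in this article (the exact column definitions are illustrative):

CREATE TABLE person (
    id   INT UNSIGNED     NOT NULL AUTO_INCREMENT,  -- small, NOT NULL numeric key
    name CHAR(20)         NOT NULL DEFAULT '',      -- fixed-length instead of VARCHAR
    age  TINYINT UNSIGNED NOT NULL DEFAULT 0,       -- smallest type that fits; 0 as the sentinel default
    PRIMARY KEY (id)
);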

17. Optimizing LIMIT pagination over large data sets (when the offset is very large, LIMIT becomes very slow)

Here is a simple technique for making LIMIT more efficient: use a covering index (in plain terms, the inner SELECT reads only the index and never touches the full rows) instead of paging over entire rows, then join the ids obtained through the covering index back to the full rows to pick up the columns you want. This is more efficient than offsetting over whole rows. Consider the following query:

mysql> SELECT film_id, description FROM sakila.film ORDER BY title LIMIT 50, 5;

If the table is very large, this query is best written as follows:

mysql> SELECT film.film_id, film.description FROM sakila.film
       INNER JOIN (SELECT film_id FROM sakila.film ORDER BY title LIMIT 50, 5) AS lim USING (film_id);

18. Inserting several rows into the same table from application code

For example, instead of the following separate statements:

-- (the ages for Xboy and Xgirl are illustrative values)
INSERT INTO person (name, age) VALUES ('Xboy', 20);
INSERT INTO person (name, age) VALUES ('Xgirl', 18);
INSERT INTO person (name, age) VALUES ('Nia', 19);

it is more efficient to merge them into a single multi-row statement:

INSERT INTO person (name, age) VALUES ('Xboy', 20), ('Xgirl', 18), ('Nia', 19);

19. Do not index only the columns being selected; that is pointless

Indexes should be placed on the columns used as conditions, such as those in WHERE and ORDER BY clauses.

SELECT id,title,content,cat_id from article WHERE cat_id = 1;

For the statement above, putting indexes on id, title, or content is pointless; it does nothing for this query. Putting an index on the foreign key cat_id, however, makes a big difference.
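
A minimal sketch of adding that index (the index name is illustrative):

ALTER TABLE article ADD INDEX idx_cat_id (cat_id);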

20. Optimizing ORDER BY in MySQL

(1). Index optimization for the ORDER BY + LIMIT combination. If an SQL statement has the form:

SELECT [column1],[column2],... FROM [table] ORDER BY [sort] LIMIT [offset],[limit];

Optimizing this statement is simple: build an index on the [sort] column.

(2). Index optimization for WHERE + ORDER BY + LIMIT, of the form:

SELECT [column1],[column2],... FROM [table] WHERE [columnX] = [value] ORDER BY [sort] LIMIT [offset],[limit];

For this statement, indexing as in the first case still lets the index be used, but not efficiently. A better approach is to create a composite index on (columnX, sort), as sketched below.
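
A minimal sketch of that composite index (table and column names are placeholders):

-- Equality column first, then the sort column, so MySQL can filter on columnX
-- and read the matching rows already in [sort] order.
ALTER TABLE t1 ADD INDEX idx_columnx_sort (columnX, sort);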

(3). Index optimization for WHERE ... IN + ORDER BY + LIMIT, of the form:

SELECT [column1],[column2],... FROM [table] WHERE [columnX] IN ([value1],[value2],...) ORDER BY [sort] LIMIT [offset],[limit];

For this statement, the indexing method from the second case does not give the desired effect (the index only helps with [sort]; EXPLAIN still shows Using where; Using filesort), because [columnX] matches several different values.

At present the author has not found a better approach for this case; advice from experts is welcome.

(4). WHERE + ORDER BY on multiple columns + LIMIT, for example:

SELECT * FROM [table] WHERE uid=1 ORDER BY x, y LIMIT 0, 10;

For this statement you might be tempted to add an index on (x, y, uid), but the index that actually works better is (uid, x, y). This comes from how MySQL handles sorting: with the equality column first, MySQL can locate uid=1 and read the rows already ordered by x, y, avoiding a filesort.
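
A minimal sketch of that index (the table name is a placeholder):

-- With (uid, x, y), MySQL can seek to uid = 1 and read rows already sorted by x, y;
-- with (x, y, uid) it cannot, and a filesort is needed.
ALTER TABLE t ADD INDEX idx_uid_x_y (uid, x, y);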
