Mysql statement optimization Summary (1), mysql statement optimization Summary

Last Update:2015-01-30 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

SQL statement optimization and Indexing

1. Innerjoin and left join, right join, subquery

A. inner join is also called an equijoin, while left/rightjoin is an outer join.

Select a. id, A. name, B. id, B. name from a left join B on a. id = B. id;

Select a. id, A. name, B. id, B. name from a right join on B a. id = B. id;

Select a. id, A. name, B. id, B. name from a inner join on a. id = B. id;

It has been proved from many aspects that the inner join performance is relatively fast, because inner join is an equivalent join, and the number of rows returned may be relatively small. However, we should remember that some statements are implicitly using equivalent connections, such:

Select a. id, A. name, B. id, B. name from a, B where a. id = B. id;

Recommendation: Use inner join connection whenever possible

B. The performance of subqueries is slower than that of external connections. Try to replace subqueries with external connections.

Select * from A where exists (select * from B where id> = 3000 and A. uuid = B. uuid );

Table A has 100,000-level data, and table B has millions of tables. It takes about 2 seconds to execute the command on the local machine, we can use the explain command to check that the SUBQUERY is a dependence subquery. Mysql first executes the full table query for table A and then successively executes the SUBQUERY Based on the uuid, if the outer table is a large table, we can imagine that the query performance will be worse than this.

A simple optimization is to replace the subquery with the innerjoin method. The query statement is changed:

Select * from A inner join B using (uuid) where B. uuid >=3000;

This statement is executed for less than one second;

C. remember their order when using ON and WHERE, for example:

Select a. id, A. name, B. id, B. name from a left join B on a. id = B. id where B. NAME = 'xxx'

The execution process first filters out the number of rows in Table B after the ON statement is executed. However, the WHERE clause filters the records generated by the two connections.

However, I would like to remind you that the condition following ON can only filter the number of entries in table B, but the number of rows returned by the connection is the same as that in table. For example:

Select a. id, A. name, B. id, B. name from a left join B on a. id = B. id;

The number of records returned is the number of records in Table A. The condition after ON only filters the number of records in table B.

Select a. id, A. name, B. id, B. name from a, B where a. id = B. id

Returns the number of records that meet the condition A. id = B. id after Cartesian product.

D. when using JOIN, we should use a small result-driven result (the result of the left join table should be as small as possible. If there are conditions, we should put them on the left for processing, and right join is the same as reverse ), colleagues try to split queries involving multi-Table Union into multiple queries (the query efficiency of multiple tables is low, and it is easy to lock and block tables ). For example:

Select * from A left join B ona. id = B. ref_id where B. ref_id> 10;

Can be optimized to: select * from (select * from A wehre id> 10) T1 left join B onT1.id = B. ref_id;

2. Create an index to accelerate query performance.

A. when creating a composite index, it is best to place the field in the composite index in the where condition on the leftmost end of the composite index so that the index can be used, in order to improve the query.

B. Ensure that the connected index is of the same type, that is, the fields associated with table A and table B must be of the same type. All these types are indexed so that both tables can use indexes. If the types are different, at least one table cannot use indexes.

C. index, not only the primary key and unique key, but also any other column. When using like, one of the indexed field columns.

For example, select * from A name like 'xxx % ';

This SQL statement uses the name index (the premise is that the name has an index), and the following statement cannot use the index.

Select * from A name like '% XXX ';

Because '%' represents any character, % xxx does not know how to index, so the index cannot be used.

D. Composite Index

For example, there is a statement like this: select * from users where area = 'beijing' and age = 22;

If we create indexes on the area and age respectively, mysql queries can only use one index at a time. Therefore, full table scan improves the efficiency even though the index is not used, however, if you create a composite index on the "area" and "age" columns, the efficiency will be higher. If we create a composite index (area, age, salary), it is equivalent to creating (area, age, salary), (area, age), (area) three indexes are called the best left prefix feature. Therefore, when creating a composite index, we should place the columns most commonly used as restrictions on the leftmost and decrease in turn.

E. The index does not contain columns with NULL values.

As long as the column contains NULL values, it will not be included in the index (unless it is a unique value field, there can be a NULL value). If a column in the composite index contains NULL values, this column is invalid for this compound index. Therefore, do not set the default value of a field to NULL during database design.

F. Use short Indexes

Index a string or column. If possible, specify a prefix length. For example, if there is a CHAR (255) column and multiple values are unique within 10 or 20 characters of the money, do not index the entire column. Short indexes not only increase query speed, but also save disk space and I/O operations.

G. Sorting index

Mysql queries only use one index. Therefore, if an index is already used in the where clause, columns in order by will not use the index. Therefore, do not use sorting operations when the database's default sorting can meet the requirements. Try not to include the sorting of multiple columns. It is best to create a composite index for these columns if necessary.

3. Optimize the limit when querying millions of pages.

A. We usually use limit, for example:

Select * from A order by id limit 1, 10;

In this way, when there is very little table data, there will be no performance problems, if it reaches 10 million, such:

Select * from A order by id limit0000000, 10;

Although only 10 records are queried, the performance is unacceptable. Therefore, when the table data is large, we continue to use persistence layer frameworks such as hibernate, ibatis will have some performance problems, unless the persistence layer framework has optimized these big data tables.

B. When we encounter the above situation, we can use another statement optimization, such:

Select * from A where id> = (Select idfrom a limit 000,1) limit 10;

This is indeed much faster, but the premise is that the id field is indexed. Maybe this is not the optimal one. You can write it like this:

Select * from A where id between 000and 10000010;

This is more efficient.

4. Avoid the Select * command whenever possible

A. the more data read from the table, the slower the query. It will increase the disk operation time, or when the database server and the web server are independent, you will experience a very long network latency. It is only because the data is not required to be transmitted between servers.

5. Try not to use the by rand () command

A. There are many better ways to achieve this if you really need to display your results randomly. This function may execute the by rand () command for each independent row in the table, which consumes the processing power of the processor and then returns only one row.

6. Use limit 1 to get a unique row

A. When you want to query A table, you need to know that you need to view A row. You may want to query A unique record. You can use limit 1. to terminate the database engine and continue to scan the entire table or index, for example:

Select * from A where namelike '% XXX' limit 1;

In this way, as long as you query records that match like '% XXX', the engine will not continue to scan tables or indexes.

7. sort as few as possible

A. sorting operations consume A large amount of CPU resources, so reducing sorting can increase the cache hit rate.

8. Minimize OR

A. when multiple conditions in the where clause coexist with "or", the Mysql optimizer does not solve the optimization problem of its execution plan very well, coupled with the unique SQL and Storage layered architecture of mysql, the performance is relatively low. In many cases, union all or union is used (when necessary) instead of "or.

9. Try to replace union all with union

A. the main difference between union and union all is that the former needs to set two or more results and then perform the unique filtering operation. This will involve sorting and a large number of cpu operations, increase resource consumption and delay. So when we can confirm that there is no repeated result set or we don't care about the repeated result set, we should try to use union all instead of union.

10. Avoid type conversion

A. The "type conversion" mentioned here refers to the type conversion that occurs when the column field type in the where clause is inconsistent with the input parameter type. Manually convert the data by using the conversion function, which directly causes mysql to fail to use the index. If transformation is required, it is necessary to convert the input parameters.

11. do not perform operations on columns

A. As shown in the following figure: select * fromusers where YEAR (adddate) <2007; operations will be performed on each row, which causes index failure to scan the entire table, so we can change it:

Select * from users where adddate <'2014-01-01 ';

12. Do NOT use not in and <> operations whenever possible.

A. not in and <> operations do NOT use indexes, but scan the entire table. Not in can be replaced by not exists, and id <> 3 can be used with id> 3 or id <3; if not exists is a subquery, it can also be converted to external connections or equivalent connections as much as possible, it depends on the business logic of the specific SQL statement.

B. Convert NOT IN to LEFT JOIN, for example:

SELECT * FROM customerinfo WHERE CustomerIDNOT in (SELECT CustomerID FROM salesinfo );

Optimization:

SELECT * FROM customerinfo LEFT JOINsalesinfoON customerinfo. CustomerID = salesinfo. CustomerID WHEREsalesinfo. CustomerID is null;

13. Use batch insert to save interaction (preferably using stored procedures)

A. try to use insert into users (username, password) values ('test1', 'pass1'), ('test2', 'passs2'), ('test3', 'passs3 ');

14. Lock the table

A. although transactions are a good way to maintain database integrity, they sometimes affect database performance, especially in many application systems. because the database will be locked during transaction execution, other user requests can only wait until the transaction is settled. if a database system is used by only a few users, the impact of transactions will not become a major problem. However, if there are thousands of users accessing a database system at the same time, such as accessing an e-commerce website, A serious response delay occurs. in fact, in some cases, we can lock the table to achieve better performance. for example:

Lock table inventory write

Select quanity from inventory whereitem = 'book ';

...

Update inventory set quantity = 11 whereitem = 'book ';

Unlock tables;

Here, we use a select statement to retrieve the initial data. Through some calculations, we use the update statement to update the new value to the list. The lock table statement containing the write keyword ensures that no other access is allowed to insert, update, or delete the inventory before the unlock tables command is executed.

15. Create a view for multi-table join queries

A. multi-Table association may cause performance problems. We can create a view for multiple tables. In this case, we can increase data security by using views, you can only query and modify specified data. In addition, the logical independence of the table is improved, and the view can avoid the impact of changes in the original table structure.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Mysql statement optimization Summary (1), mysql statement optimization Summary

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Mysql statement optimization Summary (1), mysql statement optimization Summary

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support