Optimizing MySQL query speed when handling massive amounts of data

Source: Internet
Author: User

 

Real project experience shows that once a MySQL table reaches millions of rows, the efficiency of ordinary SQL queries drops sharply, and with more conditions in the WHERE clause the query speed becomes simply intolerable. A conditional query tested against a table of more than 4 million records (with an index) once took as long as 40 seconds; latency that high would drive any user crazy. Improving the efficiency of SQL queries is therefore very important. Below are 30 SQL query optimization techniques collected from around the web. (Note: several of the examples use SQL Server syntax, such as #temp tables and index hints; the underlying ideas carry over to MySQL.)

1. Try to avoid using the != or <> operator in the WHERE clause; otherwise the engine abandons the index and performs a full table scan.

2. To optimize a query, avoid full table scans; first consider building indexes on the columns involved in WHERE and ORDER BY.

3. Try to avoid NULL checks on a field in the WHERE clause; otherwise the engine abandons the index and performs a full table scan, e.g.:

    SELECT id FROM t WHERE num IS NULL

You can set a default value of 0 on num, make sure the num column contains no NULLs, and then query:

    SELECT id FROM t WHERE num = 0

4. Try to avoid using OR to join conditions in the WHERE clause; otherwise the engine abandons the index and performs a full table scan, e.g.:

    SELECT id FROM t WHERE num = 10 OR num = 20

can be rewritten as:

    SELECT id FROM t WHERE num = 10
    UNION ALL
    SELECT id FROM t WHERE num = 20

5. The following query also results in a full table scan:

    SELECT id FROM t WHERE name LIKE '%abc%'

To use an index, the pattern must not begin with a percent sign (e.g. LIKE 'abc%'); to improve efficiency, consider full-text indexing.

6. IN and NOT IN should also be used with caution, or they will result in a full table scan, e.g.:

    SELECT id FROM t WHERE num IN (1, 2, 3)

For continuous values, use BETWEEN instead of IN:

    SELECT id FROM t WHERE num BETWEEN 1 AND 3

7. Using a parameter in the WHERE clause also causes a full table scan. SQL resolves local variables only at run time, but the optimizer cannot defer the choice of access plan to run time; it must choose at compile time, when the variable's value is still unknown and so cannot serve as an input for index selection. The following statement performs a full table scan:

    SELECT id FROM t WHERE num = @num

You can change it to force the query to use an index:

    SELECT id FROM t WITH (INDEX(index_name)) WHERE num = @num

(This hint is SQL Server syntax; MySQL uses FORCE INDEX.)

8. Try to avoid expressions on the field in the WHERE clause, which cause the engine to abandon the index and perform a full table scan. For example:

    SELECT id FROM t WHERE num / 2 = 100

should be changed to:

    SELECT id FROM t WHERE num = 100 * 2

9. Try to avoid function calls on the field in the WHERE clause, which cause the engine to abandon the index and perform a full table scan. For example:

    SELECT id FROM t WHERE SUBSTRING(name, 1, 3) = 'abc'                  -- ids whose name starts with 'abc'
    SELECT id FROM t WHERE DATEDIFF(day, createdate, '2005-11-30') = 0    -- ids generated on 2005-11-30

should be changed to:

    SELECT id FROM t WHERE name LIKE 'abc%'
    SELECT id FROM t WHERE createdate >= '2005-11-30' AND createdate < '2005-12-01'

10. Do not perform functions, arithmetic, or other expression operations on the left side of the "=" in the WHERE clause, or the system may not be able to use the index correctly.
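Rewrites like the OR-to-UNION ALL transformation in tip 4 are easy to sanity-check for correctness before worrying about speed: both forms must return the same rows. A minimal sketch using Python's built-in sqlite3 module as a stand-in database (the table t and column num are the article's running example; SQLite's optimizer differs from MySQL's, so this checks only equivalence of results, not performance):

```python
import sqlite3

# In-memory database with the article's example table t(id, num).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany("INSERT INTO t (num) VALUES (?)",
                 [(10,), (20,), (30,), (10,), (15,)])
conn.execute("CREATE INDEX idx_num ON t (num)")

# Tip 4: the OR form ...
or_rows = conn.execute(
    "SELECT id FROM t WHERE num = 10 OR num = 20").fetchall()

# ... and the UNION ALL rewrite must return the same set of ids.
union_rows = conn.execute(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL "
    "SELECT id FROM t WHERE num = 20").fetchall()

assert sorted(or_rows) == sorted(union_rows)
```

Against a real MySQL instance the same check works unchanged; `EXPLAIN` on each form would then show whether the index is actually used.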
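Tips 8–10 are all instances of the same rule: keep the bare column on the left of the comparison so the index remains usable. A small equivalence check of two of those rewrites, again using sqlite3 as a hedged stand-in (SQLite's `substr` plays the role of SUBSTRING; column and table names follow the article's examples):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER, name TEXT)")
conn.executemany("INSERT INTO t (num, name) VALUES (?, ?)",
                 [(200, "abcdef"), (100, "xyz"), (200, "abc")])
conn.execute("CREATE INDEX idx_num ON t (num)")
conn.execute("CREATE INDEX idx_name ON t (name)")

# Tip 8: arithmetic on the column vs. the index-friendly constant form.
expr = conn.execute("SELECT id FROM t WHERE num / 2 = 100").fetchall()
plain = conn.execute("SELECT id FROM t WHERE num = 100 * 2").fetchall()
assert sorted(expr) == sorted(plain)

# Tip 9: a function on the column vs. a prefix LIKE that can use the index.
fn_form = conn.execute(
    "SELECT id FROM t WHERE substr(name, 1, 3) = 'abc'").fetchall()
like_form = conn.execute(
    "SELECT id FROM t WHERE name LIKE 'abc%'").fetchall()
assert sorted(fn_form) == sorted(like_form)
```

The date rewrite in tip 9 follows the same pattern: replace `DATEDIFF(...) = 0` with a half-open range on the raw createdate column.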
11. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition or the index will not be used; the field order should also match the index order as much as possible.

12. Do not write meaningless queries. For example, to generate an empty table structure:

    SELECT col1, col2 INTO #t FROM t WHERE 1 = 0

Such code returns no result set but still consumes system resources; change it to:

    CREATE TABLE #t (...)

13. It is often a good choice to replace IN with EXISTS:

    SELECT num FROM a WHERE num IN (SELECT num FROM b)

can be replaced with:

    SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE num = a.num)

14. Not all indexes are effective for every query. SQL optimizes queries based on the data in the table; when an indexed column contains a large number of duplicate values, the query may not use the index at all. If a table has a sex field that is roughly half "male" and half "female", even an index on sex does nothing for query efficiency.

15. More indexes are not always better. Indexes can improve SELECT efficiency, but they also reduce the efficiency of INSERT and UPDATE, since those operations may have to rebuild the index; how to index therefore needs careful, case-by-case consideration. A table should preferably have no more than six indexes; beyond that, consider whether the rarely used ones are really necessary.

16. Avoid updating clustered index data columns wherever possible, because the order of the clustered index columns is the physical storage order of the table's rows; once those column values change, the entire table's row order must be adjusted, at considerable cost. If your application frequently updates clustered index columns, reconsider whether the index should be clustered at all.

17. Use numeric fields where possible. If a field holds only numeric values, do not design it as a character type; that reduces query and join performance and increases storage overhead. The engine compares each character of a string one at a time when processing queries and joins, whereas a numeric comparison needs to happen only once.

18. Use varchar/nvarchar instead of char/nchar where possible. Variable-length fields take less storage space, and for queries, searching within a smaller field is clearly more efficient.

19. Do not use SELECT * FROM t anywhere; replace "*" with the specific list of fields, and do not return any field you will not use.

20. Try to use table variables instead of temporary tables. Note that if a table variable contains a large amount of data, its indexes are very limited (primary key only).

21. Avoid frequently creating and dropping temporary tables, to reduce the consumption of system table resources.

22. Temporary tables are not unusable; using them appropriately can make certain routines more efficient, for example when you need to repeatedly reference a dataset from a large table or a commonly used table. For one-off operations, however, an export table is better.

23. When creating a temporary table, if the amount of data inserted at once is large, use SELECT INTO instead of CREATE TABLE plus INSERT, to avoid generating a large volume of log records and to improve speed; if the amount of data is small, create the table first and then insert, to ease the load on system table resources.
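The IN-to-EXISTS rewrite in tip 13 can likewise be verified to return identical rows before comparing plans. A sketch with sqlite3 (tables a and b as in the article's example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (num INTEGER);
    CREATE TABLE b (num INTEGER);
    INSERT INTO a (num) VALUES (1), (2), (3), (4);
    INSERT INTO b (num) VALUES (2), (4), (5);
""")

# Tip 13: IN with a subquery ...
in_rows = conn.execute(
    "SELECT num FROM a WHERE num IN (SELECT num FROM b)").fetchall()

# ... versus the correlated EXISTS form.
exists_rows = conn.execute(
    "SELECT num FROM a WHERE EXISTS "
    "(SELECT 1 FROM b WHERE b.num = a.num)").fetchall()

assert sorted(in_rows) == sorted(exists_rows)
```

Note that the equivalence holds here because b.num contains no NULLs; with NOT IN and NULLs in the subquery, the two forms diverge, which is one more reason tip 6 urges caution.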
24. If temporary tables are used, be sure to explicitly remove all of them at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE, which avoids holding locks on the system tables for long periods.

25. Avoid cursors as much as possible, because they are inefficient; if a cursor operates on more than 10,000 rows of data, consider rewriting it.

26. Before using a cursor-based or temporary-table method, first look for a set-based solution to the problem; set-based approaches are usually more efficient.

27. Like temporary tables, cursors are not unusable. Using a FAST_FORWARD cursor on a small dataset is often preferable to other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that include "totals" in the result set are usually faster than ones using cursors. If development time permits, try both the cursor-based and the set-based approach and keep whichever works better.

28. Use SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end, so that no DONE_IN_PROC message is sent to the client after each statement of the stored procedure or trigger.

29. Avoid returning large amounts of data to the client; if the data volume is too large, consider whether the corresponding requirement is reasonable.

30. Avoid large transactions, to improve the system's ability to handle concurrency.
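Tips 25–27 contrast row-by-row (cursor-style) processing with set-based statements. The difference is easy to see even from client code: fetching every row to total it on the client does per-row work that a single aggregate query pushes into the database. A minimal illustration with sqlite3 (a hypothetical orders table; the performance gap grows with row count and network distance):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [(i,) for i in range(1, 101)])

# Row-by-row ("cursor-style"): pull every row and total it client-side.
total_rows = 0
for (amount,) in conn.execute("SELECT amount FROM orders"):
    total_rows += amount

# Set-based: let the database compute the total in one statement.
(total_set,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

assert total_rows == total_set
```

This mirrors tip 27's observation that routines computing "totals" in the result set are usually faster than cursor loops.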
