"Mysql" Big Data Processing optimization method

Source: Internet
Author: User

1. Avoid using the != or <> operators in the WHERE clause where possible; otherwise the engine may abandon the index and perform a full table scan.

2. To optimize a query, try to avoid full table scans; first consider building indexes on the columns used in WHERE and ORDER BY.
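
A minimal sketch of this advice follows. The orders table, its columns, and the index name are hypothetical, chosen only for illustration; whether a given engine actually uses the index depends on its optimizer and on the data.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE orders (
    id          INT            NOT NULL PRIMARY KEY,
    customer_id INT            NOT NULL,
    created_at  DATETIME       NOT NULL,
    amount      DECIMAL(10, 2) NOT NULL
);

-- One index whose leading column is the filter column
-- and whose second column is the sort column.
CREATE INDEX ix_orders_customer_created
    ON orders (customer_id, created_at);

-- The engine can seek on customer_id and read the matching
-- rows in created_at order, instead of scanning and sorting.
SELECT id, amount
FROM orders
WHERE customer_id = 42
ORDER BY created_at;
```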

3. Avoid testing a field for NULL in the WHERE clause where possible; otherwise the engine may abandon the index and perform a full table scan. For example:

select id from t where num is null
You can set a default value of 0 on num, make sure the num column contains no NULL values, and then query like this:
select id from t where num = 0

4. Avoid using OR to join conditions in the WHERE clause where possible; otherwise the engine may abandon the index and perform a full table scan. For example:

select id from t where num = 10 or num = 20
You can query like this instead:
select id from t where num = 10 union all select id from t where num = 20

5. The following query will also cause a full table scan (do not put a percent sign at the start of the pattern):

select id from t where name like '%abc%'
To improve efficiency, consider full-text search.

6. Use IN and NOT IN with caution as well; otherwise they may cause a full table scan. For example:

select id from t where num in (1, 2, 3)
For a continuous range of values, use BETWEEN instead of IN:
select id from t where num between 1 and 3

7. Using parameters (local variables) in the WHERE clause can also cause a full table scan. SQL resolves local variables only at run time, but the optimizer cannot defer the choice of access plan until run time; it must choose at compile time. When the access plan is built at compile time, however, the value of the variable is still unknown, so it cannot be used as an input for index selection. The following statement will perform a full table scan:

select id from t where num = @num
You can change it to force the query to use an index (index_name stands for the name of the index):
select id from t with (index(index_name)) where num = @num

8. Avoid performing expression operations on a field in the WHERE clause where possible; doing so causes the engine to abandon the index and perform a full table scan. For example:

select id from t where num / 2 = 100
should be changed to:
select id from t where num = 100 * 2

9. Avoid applying functions to a field in the WHERE clause where possible; doing so causes the engine to abandon the index and perform a full table scan. For example:

select id from t where substring(name, 1, 3) = 'abc'  -- ids whose name starts with abc
select id from t where datediff(day, createdate, '2005-11-30') = 0  -- ids created on 2005-11-30
should be changed to:
select id from t where name like 'abc%'
select id from t where createdate >= '2005-11-30' and createdate < '2005-12-01'

10. Do not perform functions, arithmetic operations, or other expression operations on the left side of "=" in the WHERE clause, or the index may not be used correctly by the system.

11. When an indexed field is used as a condition and the index is a composite index, the condition must include the first field of the index for the system to use that index; otherwise the index will not be used. Keep the field order consistent with the index order as far as possible.
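
The leading-column rule can be sketched as follows. The employees table, its columns, and the index name are hypothetical; the comments describe typical optimizer behavior, not a guarantee for any particular engine or data distribution.

```sql
-- Hypothetical table and composite index, for illustration only.
CREATE TABLE employees (
    id        INT         NOT NULL PRIMARY KEY,
    dept_id   INT         NOT NULL,
    hire_date DATETIME    NOT NULL,
    name      VARCHAR(50) NOT NULL
);

CREATE INDEX ix_employees_dept_hire
    ON employees (dept_id, hire_date);

-- Filters on the leading column (dept_id): the index can be used for a seek.
SELECT id, name
FROM employees
WHERE dept_id = 7
  AND hire_date >= '2020-01-01';

-- Skips the leading column: the engine generally cannot seek on this
-- index and may fall back to scanning.
SELECT id, name
FROM employees
WHERE hire_date >= '2020-01-01';
```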

12. Do not write meaningless queries. For example, to generate an empty table structure:

select col1, col2 into #t from t where 1 = 0
This kind of code returns no result set but still consumes system resources; it should be changed to:
create table #t (...)

13. In many cases, using EXISTS instead of IN is a good choice:

select num from a where num in (select num from b)
Replace it with the following statement:
select num from a where exists (select 1 from b where num = a.num)

14. Not every index helps a query. The optimizer chooses a plan based on the data in the table, and when an indexed column contains a large number of duplicate values, the query may not use the index. For example, if a table has a sex field whose values are split almost evenly between male and female, an index on sex does nothing for query efficiency.

15. More indexes are not always better. An index can make the corresponding SELECT faster, but it also slows down INSERT and UPDATE, because the index may have to be rebuilt on each INSERT or UPDATE. How to build indexes therefore needs careful consideration, case by case. A table should preferably have no more than 6 indexes; if it has more, consider whether the rarely used ones are really necessary.

16. Avoid updating clustered index columns as much as possible. The order of the clustered index columns is the physical storage order of the table's records, so a change to a column value forces the records to be reordered, which consumes considerable resources. If the application needs to update clustered index columns frequently, consider whether the index should be clustered at all.

17. Use numeric fields where possible. A field that holds only numeric information should not be designed as a character type, which reduces query and join performance and increases storage overhead. The engine compares strings character by character when processing queries and joins, whereas a numeric type needs only a single comparison.

18. Use varchar/nvarchar instead of char/nchar where possible. First, variable-length fields take less storage space; second, searching within a smaller field is more efficient.

19. Do not use select * from t anywhere; replace "*" with a specific field list, and do not return any fields that are not used.

20. Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, be aware that the index is very limited (only the primary key index).
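
As a sketch of what a table variable looks like in T-SQL (SQL Server syntax, which the article's #t and @num examples suggest), assuming hypothetical names and sample values throughout:

```sql
-- Hypothetical names and data; T-SQL (SQL Server) syntax.
DECLARE @recent_orders TABLE (
    id     INT            NOT NULL PRIMARY KEY,  -- the primary key provides the only index
    amount DECIMAL(10, 2) NOT NULL
);

INSERT INTO @recent_orders (id, amount)
VALUES (1, 19.90), (2, 5.00), (3, 42.50);

-- The table variable is queried like any other table,
-- and goes out of scope when the batch or procedure ends.
SELECT id, amount
FROM @recent_orders
WHERE amount > 10;
```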

21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

22. Temporary tables are not off limits; used appropriately, they can make certain routines more efficient, for example when you need to repeatedly reference a data set from a large table or a frequently used table. For one-time events, however, a derived table is the better choice.

23. When creating a temporary table, if a large amount of data is inserted at once, use SELECT INTO instead of CREATE TABLE to avoid generating a large volume of log records and so improve speed. If the amount of data is small, CREATE TABLE first and then INSERT, to ease the load on the system tables.
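
The two forms can be sketched in T-SQL as follows. The source table big_table and the temporary table names are hypothetical. Whether SELECT INTO actually reduces logging depends on the database's recovery model, so treat this as an illustration of the two forms rather than a performance guarantee.

```sql
-- Hypothetical source table, for illustration only.
CREATE TABLE big_table (
    id         INT         NOT NULL PRIMARY KEY,
    name       VARCHAR(50) NOT NULL,
    created_at DATETIME    NOT NULL
);

-- Large one-off load: SELECT INTO creates #big_copy and fills it in one statement.
SELECT id, name
INTO #big_copy
FROM big_table
WHERE created_at >= '2005-01-01';

-- Small data set: create the temporary table first, then insert.
CREATE TABLE #small_copy (
    id   INT         NOT NULL,
    name VARCHAR(50) NOT NULL
);

INSERT INTO #small_copy (id, name)
SELECT id, name
FROM big_table
WHERE id <= 100;
```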

24. If temporary tables are used, explicitly delete all of them at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE. This avoids holding locks on the system tables for a long time.
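
A minimal T-SQL sketch of this cleanup order, assuming a hypothetical procedure name and temporary table:

```sql
CREATE PROCEDURE dbo.usp_temp_table_demo   -- hypothetical procedure name
AS
BEGIN
    CREATE TABLE #work (id INT NOT NULL);

    INSERT INTO #work (id) VALUES (1), (2), (3);

    SELECT COUNT(*) AS row_count FROM #work;

    -- Explicit cleanup at the end of the procedure:
    -- empty the table first, then drop it.
    TRUNCATE TABLE #work;
    DROP TABLE #work;
END;
```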

25. Avoid cursors as much as possible, because they are inefficient; if a cursor operates on more than 10,000 rows of data, consider rewriting it.

26. Before resorting to a cursor-based or temporary-table method, look for a set-based solution first; the set-based approach is usually more efficient.
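
To make the contrast concrete, here is a sketch in T-SQL that performs the same change twice, once row by row with a cursor and once as a single set-based statement. The accounts table and the cursor name are hypothetical.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE accounts (
    id      INT            NOT NULL PRIMARY KEY,
    balance DECIMAL(12, 2) NOT NULL
);

-- Cursor-based: one UPDATE per fetched row.
DECLARE @id INT;
DECLARE account_cursor CURSOR LOCAL STATIC FOR
    SELECT id FROM accounts WHERE balance < 0;

OPEN account_cursor;
FETCH NEXT FROM account_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE accounts SET balance = 0 WHERE id = @id;
    FETCH NEXT FROM account_cursor INTO @id;
END;
CLOSE account_cursor;
DEALLOCATE account_cursor;

-- Set-based: the same change expressed as a single statement.
UPDATE accounts SET balance = 0 WHERE balance < 0;
```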

27. As with temporary tables, cursors are not off limits. Using a FAST_FORWARD cursor on a small data set is often better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that include "totals" in the result set usually run faster than doing the same work with a cursor. If development time permits, try both the cursor-based and the set-based approach and see which works better.

28. SET NOCOUNT ON at the beginning of every stored procedure and trigger, and SET NOCOUNT OFF at the end. With NOCOUNT on, the server does not need to send a DONE_IN_PROC message to the client after each statement in the stored procedure or trigger.
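
A minimal T-SQL sketch of this pattern; the procedure name and the table variable are hypothetical:

```sql
CREATE PROCEDURE dbo.usp_nocount_demo   -- hypothetical procedure name
AS
BEGIN
    SET NOCOUNT ON;   -- suppress the per-statement "rows affected" messages

    DECLARE @t TABLE (id INT NOT NULL);
    INSERT INTO @t (id) VALUES (1), (2), (3);
    UPDATE @t SET id = id + 10;

    SELECT id FROM @t;   -- result sets are still returned to the client

    SET NOCOUNT OFF;
END;
```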

29. Avoid returning large amounts of data to the client; if the data volume is too large, consider whether the underlying requirement is reasonable.

30. Avoid large transactions where possible, to improve the system's concurrency.
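
The article does not say how to break up a large transaction. One common approach, shown here only as a sketch in T-SQL, is to process rows in small batches so that each statement commits on its own. The event_log table and the batch size are assumptions.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE event_log (
    id         INT      NOT NULL PRIMARY KEY,
    created_at DATETIME NOT NULL
);

DECLARE @batch_size INT = 5000;   -- assumed batch size; tune for the workload
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    -- Outside an explicit transaction, each DELETE commits on its own,
    -- so locks are held only for the duration of one batch.
    DELETE TOP (@batch_size) FROM event_log
    WHERE created_at < '2005-01-01';

    SET @rows = @@ROWCOUNT;   -- 0 once no matching rows remain
END;
```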
