Recently, driven by work requirements, I began to look into optimizing select query statements for MySQL databases.
In an actual project I found that once a MySQL table reaches millions of rows, the efficiency of ordinary SQL queries drops off sharply, and when the where clause carries many conditions the query speed becomes simply intolerable. In one test, a conditional query against an indexed table containing more than 4 million records took as long as 40 seconds; with that kind of latency, any user would go crazy. Improving the efficiency of SQL query statements is therefore very important. Below are 30 ways to optimize SQL query statements that are widely circulated online:
1. Try to avoid using the != (or <>) operator in the where clause; otherwise the engine will abandon the index and perform a full table scan.
2. To optimize queries, try to avoid full table scans; the first step is to build indexes on the columns involved in where and order by.
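A minimal sketch of this tip (the index name and the columns num and create_date are illustrative, not from the original article):
create index idx_t_num_createdate on t (num, create_date)
select id from t where num = 10 order by create_date   -- both the filter and the sort can use the index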
3. Try to avoid testing a field for null in the where clause; otherwise the engine will abandon the index and perform a full table scan. For example:
select id from t where num is null
Instead, set a default value of 0 on num, make sure the num column in the table never contains null, and then query like this:
select id from t where num=0
4. Try to avoid using or to join conditions in the where clause; otherwise the engine will abandon the index and perform a full table scan. For example:
select id from t where num=10 or num=20
Rewrite the query like this:
select id from t where num=10 union all select id from t where num=20
5. The following query will also result in a full table scan (a leading percent sign prevents index use):
select id from t where name like '%abc%'
To be more efficient, consider full-text indexing.
6. Use in and not in with caution; otherwise they can also result in a full table scan. For example:
select id from t where num in (1,2,3)
For a continuous range of numbers, use between instead of in:
select id from t where num between 1 and 3
7. Using parameters in the where clause can also cause a full table scan. SQL resolves local variables only at run time, but the optimizer cannot defer the choice of access plan until run time; it must choose at compile time. At compile time, however, the value of the variable is still unknown and therefore cannot be used as an input for index selection. The following statement performs a full table scan:
select id from t where num=@num
You can instead force the query to use the index (index_name is the name of an index on num):
select id from t with (index(index_name)) where num=@num
8. Try to avoid performing expression operations on a field in the where clause, which causes the engine to abandon the index and perform a full table scan. For example:
select id from t where num/2=100
should read:
select id from t where num=100*2
9. Try to avoid applying functions to a field in the where clause, which causes the engine to abandon the index and perform a full table scan. For example:
select id from t where substring(name,1,3)='abc'  --ids whose name starts with 'abc'
select id from t where datediff(day, createdate, '2005-11-30')=0  --ids generated on '2005-11-30'
should read:
select id from t where name like 'abc%'
select id from t where createdate>='2005-11-30' and createdate<'2005-12-1'
10. Do not perform functions, arithmetic operations, or other expression operations on the left side of the "=" in the where clause, or the system may not be able to use the index correctly.
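For illustration, assuming the same table t with an integer column num:
select id from t where num + 1 = 10   -- expression on the left of "=": the index on num cannot be used
select id from t where num = 10 - 1   -- equivalent condition with num left untouched: the index can be used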
11. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition for the system to use the index; otherwise the index will not be used. The field order should also match the index order as much as possible.
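A sketch of this rule, using an assumed composite index on a hypothetical employees table:
create index idx_emp_name on employees (last_name, first_name)
select id from employees where last_name = 'Smith' and first_name = 'Ann'   -- leading column present: index usable
select id from employees where first_name = 'Ann'                           -- leading column missing: index is typically not used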
12. Do not write meaningless queries, such as generating an empty table structure like this:
select col1, col2 into #t from t where 1=0
This kind of code returns no result set but still consumes system resources; change it to:
create table #t (...)
13. It is often a good choice to use exists instead of in:
select num from a where num in (select num from b)
Replace it with the following statement:
select num from a where exists (select 1 from b where num=a.num)
14. Not all indexes are effective for a query. SQL optimizes queries based on the data in the table; when an indexed column contains a large amount of duplicated data, the query may not use the index at all. For example, if a table has a sex field whose values are roughly half male and half female, then even building an index on sex will not help query efficiency.
15. More indexes are not always better. Indexes can improve the efficiency of the corresponding select, but they also reduce the efficiency of insert and update, because an insert or update may cause the indexes to be rebuilt. How to build indexes therefore needs careful consideration, depending on the circumstances. A table should preferably have no more than 6 indexes; if there are more, consider whether indexes on rarely used columns are really necessary.
16. Avoid updating clustered index data columns as much as possible, because the order of the clustered index columns is the physical storage order of the table's records; once a column value changes, reordering the entire table's records consumes considerable resources. If the application needs to update clustered index columns frequently, consider whether the index should really be built as a clustered index.
17. Use numeric fields wherever possible; fields that contain only numeric information should not be designed as character types, as this reduces query and join performance and increases storage overhead. The engine compares each character of a string one at a time when processing queries and joins, whereas a numeric type needs only a single comparison.
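As an illustration (table and column names assumed):
create table order_numeric (order_no int)           -- numeric column: one comparison per value
create table order_text    (order_no varchar(20))   -- character column: compared character by character, with more storage overhead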
18. Use varchar/nvarchar instead of char/nchar as much as possible. Variable-length fields take less storage space, which saves storage, and for queries, searching within a relatively small field is clearly more efficient.
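A brief sketch (the table and column names are assumed):
create table customer_fixed    (remark char(500))      -- fixed length: always occupies 500 characters of storage
create table customer_variable (remark varchar(500))   -- variable length: stores only the actual content, so searches scan less data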
19. Never use select * from t anywhere; replace the "*" with a specific list of fields, and do not return any fields that are not actually used.
20. Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, be aware that its indexes are very limited (only the primary key index).
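A minimal T-SQL sketch, assuming the columns id, name, and num exist on t:
declare @t table (id int primary key, name varchar(50))   -- table variable; only the primary key index is available
insert into @t (id, name) select id, name from t where num = 10
select id, name from @t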
21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.
22. Temporary tables are not something to avoid entirely; using them appropriately can make certain routines more efficient, for example when you need to repeatedly reference a dataset from a large table or a commonly used table. For one-time events, however, it is better to use an export table.
23. When creating a new temporary table, if a large amount of data is inserted at once, use select into instead of create table to avoid generating a large number of logs and to improve speed; if the amount of data is small, use create table followed by insert to ease the load on the system tables.
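A sketch of both cases, using assumed column names:
-- large one-off data set: select into avoids heavy logging
select id, name into #t_big from t where num = 10
-- small data set: create the table first, then insert, to ease the load on the system tables
create table #t_small (id int, name varchar(50))
insert into #t_small (id, name) select id, name from t where num = 20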
24. If temporary tables are used, make sure all of them are explicitly deleted at the end of the stored procedure: first truncate table, then drop table. This avoids long locks on the system tables.
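A sketch of the clean-up pattern inside a hypothetical stored procedure:
create procedure usp_demo
as
begin
    create table #t (id int)
    insert into #t (id) select id from t where num = 10
    -- ... work with #t ...
    truncate table #t    -- release the data pages first
    drop table #t        -- then drop the table, keeping locks on the system tables short
end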
25. Avoid using cursors as much as possible, because cursors are inefficient; if a cursor operates on more than 10,000 rows of data, consider rewriting it.
26. Before using a cursor-based or temporary-table-based method, look first for a set-based solution to the problem; set-based approaches are usually more efficient.
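For example, a cursor that walks rows one by one to update them can often be replaced by a single set-based statement (the status and create_date columns are assumed):
update t set status = 'archived' where create_date < '2005-01-01'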
27. As with temporary tables, cursors are not to be avoided entirely. Using a FAST_FORWARD cursor on a small dataset is often preferable to other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that compute "totals" in the result set are usually faster than doing the same with cursors. If development time permits, try both the cursor-based approach and the set-based approach to see which works better.
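A minimal FAST_FORWARD cursor sketch in T-SQL (table and column names assumed):
declare @id int
declare c cursor fast_forward for select id from t where num = 10
open c
fetch next from c into @id
while @@fetch_status = 0
begin
    -- process @id here
    fetch next from c into @id
end
close c
deallocate c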
28. Use SET NOCOUNT ON at the beginning of all stored procedures and triggers and SET NOCOUNT OFF at the end. There is no need to send a DONE_IN_PROC message to the client after every statement of a stored procedure or trigger.
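A sketch of the pattern in a hypothetical stored procedure:
create procedure usp_get_items
as
begin
    set nocount on    -- suppress the extra "rows affected" (DONE_IN_PROC) messages
    select id, name from t where num = 10
    set nocount off
end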
29. Try to avoid returning large amounts of data to the client; if the data volume is too large, consider whether the corresponding requirement is reasonable.
30. Try to avoid large transaction operations in order to improve the system's concurrency.
Article from: Github
Author: IT Program Lion
Link: http://www.imooc.com/article/1204
Source: imooc (慕课网)
Original title: Some ways to optimize query speed when MySQL processes massive data (repost)