SQL Server mass data query code optimization and recommendations

Last Update:2017-07-16 Source: Internet

Author: User

1. Avoid testing a field for NULL in the WHERE clause where possible; it can cause the engine to abandon the index and perform a full table scan. For example:
Select ID from t where num is null

Instead, set a default value of 0 on num, make sure the num column in the table never contains NULL, and then query:
Select ID from t where num=0

2. Try to avoid using the != or <> operator in the WHERE clause; otherwise the engine may abandon the index and perform a full table scan. The optimizer cannot use the index to determine how many rows will match, so it has to search every row of the table.

3. Try to avoid using OR to join conditions in the WHERE clause; doing so can cause the engine to abandon the index and perform a full table scan. For example:
Select ID from t where num=10 or num=20

This can be rewritten as:
Select ID from t where num=10
UNION ALL
Select ID from t where num=20
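
The rewrite above only returns the same rows as the OR form when the two branches cannot match the same row. A minimal sketch of that check, written in Python with the standard-library sqlite3 module as a stand-in for SQL Server (the table contents are made up, and SQLite's planner is not SQL Server's):

```python
import sqlite3

# Hypothetical sample data for the article's table t(id, num).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany(
    "INSERT INTO t (id, num) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 10), (5, None)],
)

# Original form: one query with OR.
or_rows = conn.execute(
    "SELECT id FROM t WHERE num = 10 OR num = 20 ORDER BY id"
).fetchall()

# Rewritten form: the branches are mutually exclusive, so no duplicates appear.
union_rows = conn.execute(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL "
    "SELECT id FROM t WHERE num = 20 "
    "ORDER BY id"
).fetchall()

# Overlapping branches: rows with num = 10 satisfy both, so each comes back twice.
overlap_rows = conn.execute(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL "
    "SELECT id FROM t WHERE num >= 10 AND num < 20 "
    "ORDER BY id"
).fetchall()

print(or_rows == union_rows)
print(overlap_rows)
```

When the branches can overlap, UNION (which removes duplicates) or a single OR predicate is the safer choice.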

4. Use IN and NOT IN with caution as well, because they can cause the system to skip the index and search the table data directly. For example:
Select ID from t where num in (1,2,3)

For a continuous range of values, use BETWEEN rather than IN:
Select ID from t where num between 1 and 3

5. Try to avoid searching indexed character data with a pattern that does not start with the leading characters. This also prevents the engine from using the index. Consider the following examples:
SELECT * from T1 WHERE NAME like '%l%'
SELECT * from T1 WHERE substring(name,2,1) = 'l'
SELECT * from T1 WHERE NAME like 'l%'
Even if the NAME field is indexed, the first two queries cannot use the index to speed up the operation; the engine has to process every row of the table to complete them. The third query can use the index.
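
The difference between a leading-wildcard pattern and a prefix pattern can be observed in a query plan. The sketch below uses Python's standard-library sqlite3 module as a stand-in, since SQL Server is not available in a self-contained example; the table contents, the extra note column, and the index name ix_t1_name are assumptions, and the NOCASE collation is a SQLite requirement for its LIKE optimization rather than anything SQL Server needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE lets SQLite's case-insensitive LIKE use the index (SQLite-specific detail).
conn.execute(
    "CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE, note TEXT)"
)
conn.execute("CREATE INDEX ix_t1_name ON t1 (name)")
conn.executemany(
    "INSERT INTO t1 (name, note) VALUES (?, ?)",
    [("lee", "a"), ("allen", "b"), ("li", "c"), ("bob", "d")],
)


def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Leading wildcard: every row has to be examined.
contains_plan = plan("SELECT * FROM t1 WHERE name LIKE '%l%'")
# Prefix pattern: the planner can turn it into a range search on the index.
prefix_plan = plan("SELECT * FROM t1 WHERE name LIKE 'l%'")

print(contains_plan)
print(prefix_plan)
```

In SQLite a plan line starting with SCAN means all rows are visited, while SEARCH means an index range is used; SQL Server reports the analogous distinction as a scan versus a seek in its execution plans.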

6. Force the query optimizer to use an index if necessary. Using a parameter in the WHERE clause can also result in a full table scan. SQL resolves local variables only at run time, but the optimizer cannot defer choosing an access plan until run time; it must choose one at compile time. When the access plan is built at compile time, the value of the variable is still unknown, so it cannot be used as input for index selection. The following statement will perform a full table scan:
Select ID from t where num=@num
To force the query to use the index instead:
Select ID from t with (index(index_name)) where num=@num

7. Try to avoid applying expressions to a field in the WHERE clause; doing so can cause the engine to abandon the index and perform a full table scan. For example:
SELECT * from T1 WHERE f1/2=100
should read:
SELECT * from T1 WHERE f1=100*2
SELECT * from RECORD WHERE SUBSTRING(card_no,1,4) = '5378'
should read:
SELECT * from RECORD WHERE card_no like '5378%'
SELECT member_number, first_name, last_name from members
WHERE DATEDIFF(yy,dateofbirth,GETDATE()) > 21
should read:
SELECT member_number, first_name, last_name from members
WHERE dateofbirth < DATEADD(yy,-21,GETDATE())
That is, any operation applied to the column, including database functions and calculation expressions, can result in a table scan, so the query should move the operation to the right-hand side of the comparison wherever possible.
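
The effect of moving the arithmetic off the column can be checked with a query plan. This is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for SQL Server; the sample rows, the note column, and the index name ix_t1_f1 are assumptions. It also shows a subtlety of the first rewrite: with an integer column, f1/2=100 matches both 200 and 201, so a range predicate is the exact equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, f1 INTEGER, note TEXT)")
conn.execute("CREATE INDEX ix_t1_f1 ON t1 (f1)")
conn.executemany(
    "INSERT INTO t1 (f1, note) VALUES (?, ?)",
    [(n, "row") for n in range(198, 204)],
)


def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def values(sql):
    """Run a single-column query and return its values, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# Expression on the column: the index on f1 cannot be used for a lookup.
expr_plan = plan("SELECT * FROM t1 WHERE f1 / 2 = 100")
# Bare column compared with a constant expression: index lookup.
bare_plan = plan("SELECT * FROM t1 WHERE f1 = 100 * 2")

# Integer division truncates, so the two predicates are not identical.
expr_values = values("SELECT f1 FROM t1 WHERE f1 / 2 = 100")
bare_values = values("SELECT f1 FROM t1 WHERE f1 = 100 * 2")
# Exact, index-friendly equivalent of f1 / 2 = 100 for integers.
range_values = values("SELECT f1 FROM t1 WHERE f1 >= 200 AND f1 < 202")

print(expr_plan, bare_plan)
print(expr_values, bare_values, range_values)
```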

8. Try to avoid applying functions to a field in the WHERE clause; doing so can cause the engine to abandon the index and perform a full table scan. For example:
Select ID from t where substring(name,1,3) = 'abc' -- IDs whose name starts with abc
Select ID from t where DATEDIFF(day,createdate,'2005-11-30') = 0 -- IDs generated on 2005-11-30
should read:
Select ID from t where name like 'abc%'
Select ID from t where createdate >= '2005-11-30' and createdate < '2005-12-1'
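
The date rewrite above replaces a function call on the column with a half-open range (start inclusive, end exclusive). A minimal sketch of the same idea in Python with the standard-library sqlite3 module follows; SQLite has no DATEDIFF, so its date() function plays the role of the function wrapped around the column, and the rows, the note column, and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Timestamps stored as ISO-8601 text, which sorts in chronological order.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, createdate TEXT, note TEXT)")
conn.execute("CREATE INDEX ix_t_createdate ON t (createdate)")
conn.executemany(
    "INSERT INTO t (createdate, note) VALUES (?, ?)",
    [
        ("2005-11-29 23:59:59", "day before"),
        ("2005-11-30 00:00:00", "start of day"),
        ("2005-11-30 17:45:10", "afternoon"),
        ("2005-12-01 00:00:00", "next day"),
    ],
)

WRAPPED = "WHERE date(createdate) = '2005-11-30'"
HALF_OPEN = "WHERE createdate >= '2005-11-30' AND createdate < '2005-12-01'"


def plan(where):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    sql = "EXPLAIN QUERY PLAN SELECT * FROM t " + where
    return [row[3] for row in conn.execute(sql)]


def ids(where):
    """Return the matching ids, sorted."""
    return sorted(row[0] for row in conn.execute("SELECT id FROM t " + where))


wrapped_plan, range_plan = plan(WRAPPED), plan(HALF_OPEN)
wrapped_ids, range_ids = ids(WRAPPED), ids(HALF_OPEN)

print(wrapped_plan, range_plan)
print(wrapped_ids == range_ids)
```

Both predicates select the two rows from 2005-11-30, but only the half-open range lets the planner search the index on createdate.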

9. Do not apply functions, arithmetic, or other expressions to the left-hand side of the "=" in the WHERE clause; otherwise the system may not be able to use the index correctly.

10. When using an indexed field as a condition, if the index is a composite index, the first field of the index must be used as a condition for the system to use the index; otherwise the index will not be used. As far as possible, keep the order of the fields consistent with the order of the index.
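
The leading-column rule can be observed directly in a query plan. This sketch uses Python's standard-library sqlite3 module as a stand-in for SQL Server; the orders table, its columns, and the index name ix_orders_cust_status are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, note TEXT)"
)
# Composite index: customer_id is the leading column, status the second.
conn.execute("CREATE INDEX ix_orders_cust_status ON orders (customer_id, status)")
conn.executemany(
    "INSERT INTO orders (customer_id, status, note) VALUES (?, ?, ?)",
    [(7, "open", "a"), (7, "closed", "b"), (8, "open", "c")],
)


def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Condition on the leading column only: index search.
leading_plan = plan("SELECT * FROM orders WHERE customer_id = 7")
# Both columns, deliberately written in the opposite order to the index.
both_plan = plan("SELECT * FROM orders WHERE status = 'open' AND customer_id = 7")
# Condition on the second column only: the index cannot be searched.
trailing_plan = plan("SELECT * FROM orders WHERE status = 'open'")

print(leading_plan)
print(both_plan)
print(trailing_plan)
```

In this sketch the order in which the two conditions are written does not change the plan; what matters is whether the leading index column appears in the conditions at all.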

11. Using EXISTS is often a good choice:
Select num from a where num in (select num from b)
can be replaced with the following statement:
Select num from a where exists (select 1 from b where num=a.num)
SELECT SUM(T1.C1) from T1 WHERE (
(SELECT COUNT(*) from T2 WHERE T2.C2=T1.C2) > 0)
SELECT SUM(T1.C1) from T1 WHERE EXISTS (
SELECT * from T2 WHERE T2.C2=T1.C2)
Both produce the same result, but the latter is more efficient than the former, because it does not generate large numbers of locking table scans or index scans.

If you only want to check whether a record exists in a table, do not use COUNT(*); it is inefficient and wastes server resources. Use EXISTS instead. For example:
IF (SELECT COUNT(*) from table_name WHERE column_name = 'xxx') > 0
can be written as:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')
You often need to write a T-SQL statement that compares a parent result set with a child result set, to find records that are in the parent result set but not in the child result set. For example:
SELECT a.hdr_key from hdr_tbl a -- "hdr_tbl a" means hdr_tbl is referred to by the alias a
WHERE NOT EXISTS (SELECT * from dtl_tbl b WHERE a.hdr_key = b.hdr_key)
SELECT a.hdr_key from hdr_tbl a
LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key WHERE b.hdr_key IS NULL
SELECT hdr_key from hdr_tbl
WHERE hdr_key NOT IN (SELECT hdr_key from dtl_tbl)
All three forms return the same correct result as long as hdr_key in dtl_tbl is never NULL, but their efficiency decreases in that order.
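
The three anti-join forms only agree while the child key column contains no NULLs: a single NULL makes NOT IN return no rows at all, under standard SQL three-valued logic. A minimal sketch of that behavior, using Python's standard-library sqlite3 module as a stand-in for SQL Server (the dtl_id column and the sample keys are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hdr_tbl (hdr_key INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE dtl_tbl (dtl_id INTEGER PRIMARY KEY, hdr_key INTEGER)")
conn.executemany("INSERT INTO hdr_tbl (hdr_key) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO dtl_tbl (hdr_key) VALUES (?)", [(1,), (2,)])

QUERIES = {
    "not_exists": (
        "SELECT a.hdr_key FROM hdr_tbl a WHERE NOT EXISTS "
        "(SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)"
    ),
    "left_join": (
        "SELECT a.hdr_key FROM hdr_tbl a "
        "LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key WHERE b.hdr_key IS NULL"
    ),
    "not_in": (
        "SELECT hdr_key FROM hdr_tbl "
        "WHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)"
    ),
}


def run_all():
    """Run each anti-join form and return its sorted keys by name."""
    return {
        name: sorted(row[0] for row in conn.execute(sql))
        for name, sql in QUERIES.items()
    }


# No NULL keys in the child table: all three forms find header 3.
clean = run_all()

# One detail row with a NULL key: NOT IN now compares against an unknown value.
conn.execute("INSERT INTO dtl_tbl (hdr_key) VALUES (NULL)")
with_null = run_all()

print(clean)
print(with_null)
```

If the child key column is nullable, NOT EXISTS is therefore the safer form regardless of speed.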

12. Use table variables instead of temporary tables whenever possible. If a table variable holds a large amount of data, be aware that its indexing is very limited (only a primary key index).

13. Avoid frequently creating and dropping temporary tables, to reduce the consumption of system table resources.

14. Temporary tables are not off limits. Used appropriately, they can make certain routines more efficient, for example when you need to repeatedly reference a large table or a data set in a frequently used table. For one-off events, however, it is better to use a derived table.

15. When creating a temporary table, if a large amount of data is inserted at once, you can use SELECT INTO instead of CREATE TABLE to avoid generating a large amount of log and so improve speed. If the amount of data is small, then to ease the load on the system tables you should CREATE TABLE first and then INSERT.

16. If temporary tables are used, be sure to explicitly delete all of them at the end of the stored procedure: first TRUNCATE TABLE, then DROP TABLE. This avoids locking the system tables for a long time.

17. SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end. The server then does not have to send a DONE_IN_PROC message to the client after each statement in the stored procedure or trigger.

18. Try to avoid large transactions, to improve the system's concurrency.

19. Try to avoid returning large volumes of data to the client. If the data volume is too large, consider whether the underlying requirement is reasonable.

20. Avoid using incompatible data types. For example, float and int, char and varchar, and binary and varbinary are incompatible. Incompatible data types may prevent the optimizer from performing optimizations that would otherwise be possible. For example:
SELECT name from employee WHERE salary > 60000
In this statement, if the salary field is of type money, the optimizer has a hard time optimizing the query, because 60000 is an integer. We should convert the integer to the money type when writing the code, rather than leave the conversion to run time.

21. Make full use of join conditions. In some cases there may be more than one join condition between two tables; writing the join conditions out in full in the WHERE clause can then greatly improve query speed. For example:
SELECT SUM(a.amount) from account a, card b WHERE a.card_no = b.card_no
SELECT SUM(a.amount) from account a, card b WHERE a.card_no = b.card_no and
a.account_no = b.account_no
The second statement can run much faster than the first.

22. Use views to speed up queries
Sorting a subset of a table and creating a view on it can sometimes speed up queries. It helps avoid repeated sort operations and also simplifies the optimizer's work in other ways.

For example:
SELECT cust.name, rcvbles.balance, ... other columns
from cust, rcvbles
WHERE cust.customer_id = rcvbles.customer_id
and rcvbles.balance > 0
and cust.postcode > '98000'
ORDER by cust.name
If this query is to be run more than once, you can collect all customers with unpaid balances in a single view, sorted by customer name:
CREATE VIEW DBO.V_cust_rcvlbes
as
SELECT cust.name, rcvbles.balance, ... other columns
from cust, rcvbles
WHERE cust.customer_id = rcvbles.customer_id
and rcvbles.balance > 0
ORDER by cust.name

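Then query the view in the following way:
SELECT * from V_cust_rcvlbes
WHERE postcode > '98000'
The view contains fewer rows than the base tables, and the physical order is the required order, which reduces disk I/O, so the query workload can be significantly reduced. Note that SQL Server only accepts ORDER BY in a view definition together with TOP or OFFSET/FETCH, and an ORDER BY on the outer query is still needed to guarantee the order of the returned rows.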

23. If DISTINCT will do, do not use GROUP BY
SELECT OrderID from Details WHERE UnitPrice > 10 GROUP by OrderID
can be changed to:
SELECT DISTINCT OrderID from Details WHERE UnitPrice > 10
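
That the two forms return the same set of order IDs can be confirmed on a small data set. A minimal sketch in Python with the standard-library sqlite3 module as a stand-in for SQL Server; the rows in Details are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Details (OrderID INTEGER, UnitPrice REAL)")
conn.executemany(
    "INSERT INTO Details (OrderID, UnitPrice) VALUES (?, ?)",
    [(1, 12.0), (1, 15.0), (2, 8.0), (3, 20.0), (3, 5.0)],
)

# GROUP BY used purely to remove duplicates (no aggregate function).
group_by_ids = sorted(
    row[0]
    for row in conn.execute(
        "SELECT OrderID FROM Details WHERE UnitPrice > 10 GROUP BY OrderID"
    )
)

# DISTINCT states the same intent directly.
distinct_ids = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT OrderID FROM Details WHERE UnitPrice > 10"
    )
)

print(group_by_ids == distinct_ids)
```

The equivalence holds only because the GROUP BY query has no aggregates; once a SUM or COUNT is involved, GROUP BY is required.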

24. If UNION ALL will do, do not use UNION
UNION ALL does not perform the SELECT DISTINCT step, which saves a lot of unnecessary resources.
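
The saving comes from skipping duplicate removal, so the two operators return different results whenever duplicates exist. A minimal sketch in Python with the standard-library sqlite3 module as a stand-in for SQL Server; the tables a and b and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (num INTEGER)")
conn.execute("CREATE TABLE b (num INTEGER)")
conn.executemany("INSERT INTO a (num) VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO b (num) VALUES (?)", [(2,), (3,)])

# UNION removes duplicates, both within and across the two inputs.
union_rows = sorted(
    row[0] for row in conn.execute("SELECT num FROM a UNION SELECT num FROM b")
)

# UNION ALL simply concatenates the inputs and keeps every row.
union_all_rows = sorted(
    row[0] for row in conn.execute("SELECT num FROM a UNION ALL SELECT num FROM b")
)

print(union_rows)
print(union_all_rows)
```

UNION ALL is the right choice when duplicates are impossible or acceptable; otherwise the cheaper operator changes the result.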

25. Try not to use the SELECT INTO statement. SELECT INTO causes the table to be locked, which prevents other users from accessing it.

The points above are some of the main considerations for improving query speed, but in many cases you still need to experiment repeatedly with different statements to find the best solution. The best approach, of course, is to test which of the SQL statements implementing the same function has the shortest run time. If the database holds very little data, however, the difference will not show up; in that case you can look at the execution plan instead: run the alternative SQL statements in Query Analyzer and press Ctrl+L to see which indexes are used and how many table scans occur (these two have the greatest impact on performance), and compare the overall query cost percentages.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
