SQL Server SQL performance optimization Tips

SQL Server SQL performance optimization Tips _mssql

Last Update:2017-01-18 Source: Internet

Author: User

Tags joins time 0

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

1. Select the most efficient table name order (valid only in the Rule-based optimizer)

The SQL Server parser processes the table names in the FROM clause in Right-to-left order, therefore, the last table in the FROM clause (driving table) is processed first, and in the case where multiple tables are included in the FROM clause, the table with the fewest number of records must be selected as the underlying table. When SQL Server processes multiple tables, it uses sorting and merging to connect them.

First, scan the first table (the last table in the FROM clause) and sort the records, then scan the second table (the last second table in the FROM clause), and finally merge all the records retrieved from the second table with the appropriate records in the first table.
For example: Table TAB1 16,384 Records table TAB2 5 records, select TAB2 as the base table (best method) Select COUNT (*) from TAB1,TAB2 execution time 0.96 seconds, select TAB2 as the underlying table (poor method) Select Count (*) from TAB2,TAB1 execution time 26.09 seconds;

If you have more than 3 table join queries, you will need to select the Crosstab table (intersection table) as the underlying table, which is the table referenced by the other tables

For example:

The EMP table describes the intersection of the location table and the category table

SELECT * 
from LOCATION L, 
CATEGORY C, 
EMP E 
WHERE e.emp_no BETWEEN 1000 and 
e.cat_no = C.cat_ NO

will be more efficient than the following SQL

SELECT * 
from EMP E, 
LOCATION L, 
CATEGORY C 
WHERE e.cat_no = c.cat_no and 
e.locn = L.LOCN

Connection order in the 2.WHERE clause

SQL Server parses the WHERE clause in a bottom-up order, according to which the connection between the tables must be written before the other where conditions, and the conditions that can filter out the maximum number of records must be written at the end of the WHERE clause
For example:
(Inefficient, execution time 156.3 seconds)

SELECT * FROM 
emp E 
WHERE SAL > 50000 and 
JOB = ' MANAGER ' 
< (select COUNT (*) from EMP wher E mgr=e.empno); 
(Efficient, execution time 10.6 seconds) 
SELECT * FROM 
emp E 
where < (select COUNT (*) from EMP where mgr=e.empno) and 
SAL > 50000

avoid the use of ' * ' in the 3.SELECT clause. When you want to list all columns in the SELECT clause, it is convenient to use a dynamic SQL column reference ' * ', which, unfortunately, is a very inefficient approach. In fact, SQL Server, in the process of parsing, converts ' * ' to all column names in turn, which is done by querying the data dictionary, which means it will take more time

4. Reduce the number of accesses to the database. when executing every SQL statement, SQL Server does a lot of work internally: Parsing SQL statements, estimating index utilization, binding variables, reading blocks, and so on
Thus, reducing the number of accesses to the database can actually reduce the workload of SQL Server, for example:
Here are three ways to retrieve employees with an employee number equal to 0342 or 0291
Method 1 (lowest effect)

SELECT Emp_name, SALARY, GRADE from 
EMP 
WHERE emp_no = 342;  
SELECT Emp_name, SALARY, GRADE from 
EMP

Method 2 (secondary low efficiency)

DECLARE 
CURSOR C1 (e_no number) is 
SELECT emp_name,salary,grade from 
EMP 
WHERE emp_no = e_no; 
BEGIN 
OPEN C1 (342); 
FETCH C1 into ...,...,...; 
... OPEN C1 (291); 
FETCH C1 into ...,...,...; 
... Close C1;

Method 2 (efficient)

SELECT A.emp_name, A.salary, A.grade, 
b.emp_name, B.salary, b.grade from 
emp A, emp B 
WHERE a.emp_no = 342

5. Use the Decode function to reduce processing time

Use the Decode function to avoid repeated scans of the same record or repeated connections to the same table

For example:

SELECT COUNT (*), SUM (SAL) from the 
EMP 
WHERE dept_no = ' 0020 ' and 
ename like ' smith% '; 
 
SELECT COUNT (*), SUM (SAL) from 
EMP 
WHERE dept_no = ' 0030 '

You can use the Decode function to get the same results efficiently.

SELECT count (DECODE (dept_no, ' 0020 ', ' x ', null)) D0020_count, 
count (DECODE (dept_no, ' 0030 ', ' x ', null)) d0030_ COUNT, 
sum (DECODE (dept_no, ' 0020 ', Sal, null)) D0020_sal, 
sum (DECODE (dept_no, 0030, Sal, null)) D0030_sal From 
EMP

' X ' means any one field
Similarly, the Decode function can also be applied to the group BY and ORDER BY clauses

6. Replace the HAVING clause with the WHERE clause

Avoid having a HAVING clause that will filter the result set only after retrieving all records, which requires sorting, statistics, and so on

If you can limit the number of records through the WHERE clause, you can reduce the overhead

For example:
Low efficiency

SELECT REGION, AVG (log_size) from 
LOCATION 
GROUP by REGION have 
REGION REGION '!= ' and 
SYDNEY ' REGION ' PERTH ' 
 
efficient 
SELECT REGION, AVG (log_size) from 
LOCATION 
WHERE REGION REGION!= ' SYDNEY ' and 
REGION!= ' PERTH '

7. Reduce the query to the table

In SQL statements that contain subqueries, pay special attention to reducing the query to the table
For example:

Low 
-Efficiency select Tab_name from 
TABLES 
where tab_name = (select Tab_name from 
tab_columns 
where VERSION = 604) and 
db_ver = (select Db_ver from 
tab_columns 
WHERE VERSION = 604) 
 
Efficient 
SELECT tab_n AME from 
the TABLES WHERE (tab_name, db_ver) = (SELECT tab_name, db_ver from 
tab_columns

Update multiple column examples:

Low-efficiency 
UPDATE EMP 
SET emp_cat = (select Max (CATEGORY) 
from emp_categories), 
sal_range = (select Max (Sal_ RANGE) from 
emp_categories) 
WHERE emp_dept = 0020; 
 
Efficient 
UPDATE EMP 
SET (emp_cat, Sal_range) = (SELECT max (CATEGORY), Max (Sal_range) from 
emp_categories)

8. Using the alias of the table (alias), when connecting multiple tables in an SQL statement, use the alias of the table and prefix the alias with each column, which reduces the time to parse and reduces the syntax errors caused by column ambiguity

9. Replace in with exists

In many base-table queries, it is often necessary to join another table in order to satisfy one condition

In this case, using EXISTS (or not EXISTS) usually increases the efficiency of the query

Low-Efficiency 
SELECT * FROM 
EMP (base table) 
WHERE EMPNO > 0 and 
DEPTNO in (select DEPTNO from 
DEPT 
where L OC = ' Melb ') 
 
efficient 
select * from 
EMP (base table) 
WHERE EMPNO > 0 and 
EXISTS (select ' X ' from 
dept< C27/>where DEPT. DEPTNO = EMP. DEPTNO

10. Use NOT exists instead of in

In a subquery, the NOT IN clause performs an internal sort and merge

In either case, not in is the least effective because it performs a full table traversal of the table in the subquery

To avoid using not in, we can rewrite it as an outer join (Outer joins) or not EXISTS

For example:

SELECT ... 
From the EMP 
WHERE dept_no not in (SELECT dept_no from 
DEPT

To improve efficiency rewrite to

High-efficiency 
SELECT ... 
From EMP A, DEPT B 
WHERE a.dept_no = b.dept (+) and 
B.dept_no is NULL and 
b.dept_cat (+) = ' A ' 
 
most efficient 
S 
elect ... From the EMP E 
where not EXISTS (SELECT ' X ' from 
DEPT D 
where d.dept_no = E.dept_no

11. Replace exists with table connection

In general, table joins are more efficient than exists

For example:

Select ename from 
EMP E 
where EXISTS (select ' X ' from 
DEPT 
where dept_no = E.dept_no and 
dept_cat = ' A '); 
 
More efficient 
SELECT ename 
from DEPT D, EMP E 
WHERE e.dept_no = d.dept_no

12. Replace distinct with exists

When submitting a query that contains multiple table information, such as a departmental table and an employee table, avoid using distinct in the SELECT clause, you can generally consider replacing the exist with the

For example:

Low 
-Efficiency Select DISTINCT dept_no, dept_name from 
DEPT D, EMP E 
WHERE d.dept_no = e.dept_no 
 
efficient 
Select D Ept_no, dept_name from 
DEPT D 
WHERE EXISTS (SELECT ' X ' from 
EMP E

exists makes queries faster because the RDBMS core module returns the result immediately after the condition of the subquery is satisfied

13. Use Index to improve efficiency

An index is a conceptual part of a table that is used to improve the efficiency of retrieving data. In fact, SQL Server uses a complex, self-balanced b-tree structure

In general, querying data through an index is faster than full table scans. When SQL Server finds the best path to execute a query and UPDATE statement, the SQL Server optimizer uses the index

Also, using indexes when connecting multiple tables can improve efficiency. Another advantage of using the index is that it provides uniqueness validation for the primary key (primary key)

In addition to those long or long raw data types, you can index almost all of the columns

It is usually especially useful to use indexes in large tables, and, of course, using indexes can also improve efficiency when scanning small tables

Although the use of indexes can improve query efficiency, we must also pay attention to its cost

Indexes require space for storage and regular maintenance, and the index itself is modified whenever a record is added or subtracted from the table or the index column is modified

This means that the insert, DELETE, and update for each record will pay 4 or 5 more disk I/O

Because indexes require additional storage space and processing, unnecessary indexing slows query response times

SQL Server has two modes of access to indexes:

1). Index unique Scan (indexed unique SCAN)

In most cases, the optimizer accesses the index by using the WHERE clause

For example:
Table Lodging has two indexes: a unique index based on the lodging column LODGING_PK and a non-unique index established on the MANAGER column Lodging$manager
SELECT *
From lodging
WHERE lodging = ' ROSE HILL ';

Internally, the above SQL is divided into two steps:

First, the LODGING_PK index is accessed by means of an index-only scan, obtaining the corresponding ROWID, and then performing the next retrieval by ROWID access to the table

If the column returned by the retrieval is included in the index column, SQL Server does not perform the second step (access to the table via ROWID)

Because retrieving data is stored in an index, access to the index alone can fully satisfy the query results

2. Indexed range Query (index range SCAN)

Applies to two situations:

1> retrieval of a range based on uniqueness index

2&GT; retrieval based on a non-unique index

Example 1
SELECT Lodging
From lodging
WHERE lodging like ' m% ';

The WHERE clause condition includes a series of values that SQL Server will query through the index range query lodging_pk

Because the index range query returns a set of values, its efficiency requires that the peso be the only scan lower

Example 2
SELECT Lodging
From lodging
WHERE MANAGER = ' BILL GATES ';

This SQL executes in two steps, Lodging$manager the index range query (gets all the rowid that match the condition Records), and gets the value of the lodging column through the ROWID Access table

Because Lodging$manager is a non unique index, the database cannot perform an index-only scan on it

In the WHERE clause, the index will not be used if the first character Fu Yutung (wildcard) of the value corresponding to the indexed column begins
SELECT Lodging
From lodging
WHERE MANAGER like '%hanman ';

In this case, SQL Server will use a full table scan

14. Avoid using calculations on indexed columns

In the WHERE clause, if the index column is part of a function, the optimizer will use a full table scan without indexing

For example:

Low efficiency
SELECT ...
From DEPT
WHERE SAL * > 25000;

Efficient
SELECT ...
From DEPT
WHERE SAL > 25000/12;

It is important to note that indexing columns are not processed in the search, such as trim,to_date, type conversions, etc., destroying the index and using full table scans to affect SQL execution efficiency

15. Avoid using is null and are NOT NULL on indexed columns

Avoid using any nullable columns in the index, SQL Server will not be able to use the index

For Single-column indexes, this record will not exist in the index if the column contains null values;

For composite indexes, if each column is empty, the record also does not exist in the index. If at least one column is not empty, the record exists in the index

If the uniqueness index is based on the columns A and B of the table, and the a,b value of a record exists in the table (123,null),

SQL Server will not accept the next record insertion with the same a,b value (123,null)

If all the indexed columns are empty, SQL Server will think that the entire key value is empty, and null cannot equal NULL, so you can insert 1000 records with the same key value, of course they are empty!

Because null values do not exist in the indexed column, NULL comparisons of indexed columns in the WHERE clause will cause SQL Server to deactivate the index

Inefficient (index failure)
SELECT ...
From DEPARTMENT
WHERE Dept_code is not NULL

16. Using Union-all and Union

When the SQL statement requires a union of two query result sets, the two result sets are merged in a union-all manner, and then sorted before outputting the final result

If you use UNION ALL instead of union, this sort of order is not necessary and the efficiency will be improved.

It should be noted that union all will repeat the same record in the two result sets, so the feasibility of using union all from the business requirements analysis

Refer to the following experiences for indexing:

1. If you retrieve the number of records in a table with more than 30% data, there is no significant efficiency improvement in using the index

2. In a particular case, using an index may be slower than a full table scan, but this is the same order of magnitude, and typically, using an index is a few times or even thousands of times times faster than a full table scan!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More