SQL Server performance optimization skills, SQL Server Performance Optimization

Source: Internet
Author: User

SQL Server performance optimization skills, SQL Server Performance Optimization

1. Select the most efficient table name sequence (only valid in the rule-based Optimizer)

The SQL Server parser processes the table names in the FROM clause in the order FROM right to left. Therefore, the table written in the FROM clause (basic table driving table) will be processed first, when the FROM clause contains multiple tables, the table with the least number of records must be selected as the base table. When SQLSERVER processes multiple tables, connect them by sorting and merging,

First, scan the first table (the last table in the FROM clause) and sort the records; then scan the second table (the last second table in the FROM clause ); finally, merge all records retrieved from the second table with the appropriate records in the first table.
For example, for TAB1 16,384 records, TAB2 5 Records, select TAB2 as the basic table (the best method) select count (*) from tab1, and tab2 takes 0.96 seconds to execute, select TAB2 as the basic table (poor method) select count (*) from tab2, tab1 execution time 26.09 seconds;

If more than three tables are connected for query, You need to select an intersection table as the base table. A cross table is the table referenced by other tables.

For example:

The EMP table describes the intersection between the LOCATION table and the CATEGORY table.

SELECT * FROM LOCATION L, CATEGORY C, EMP E WHERE E.EMP_NO BETWEEN 1000 AND 2000 AND E.CAT_NO = C.CAT_NO AND E.LOCN = L.LOCN 

It will be more efficient than the following SQL statements

SELECT * FROM EMP E , LOCATION L , CATEGORY C WHERE E.CAT_NO = C.CAT_NO AND E.LOCN = L.LOCN AND E.EMP_NO BETWEEN 1000 AND 2000 

2. Connection sequence in the WHERE clause

SQLSERVER uses the bottom-up sequence to parse the WHERE clause. Based on this principle, the join between tables must be written before other WHERE conditions. The conditions that can filter out the maximum number of records must be written at the end of the WHERE clause.
For example:
(Inefficient, execution time: 156.3 seconds)

SELECT * from emp e where sal> 50000 and job = 'manager' AND 25 <(select count (*) from emp where mgr = E. EMPNO); (efficient, execution time: 10.6 seconds) SELECT * from emp e where 25 <(select count (*) from emp where mgr = E. EMPNO) and sal> 50000 and job = 'manager ';

3. Avoid '*' in the SELECT clause '*'. When you want to list all columns in the SELECT clause, using dynamic SQL COLUMN reference '*' is a convenient method. Unfortunately, this is a very inefficient method. In fact, SQLSERVER converts '*' into all column names in sequence during parsing. This task is done by querying the data dictionary, which means it takes more time.


4. Reduce the number of database accesses.When each SQL statement is executed, SQLSERVER executes a lot of work internally: parsing SQL statements, estimating index utilization, binding variables, and reading data blocks.
It can be seen that reducing the number of visits to the database can actually reduce the workload of SQLSERVER, for example:
There are three ways to retrieve employees with employee numbers equal to 0342 or 0291
Method 1 (most inefficient)

SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO = 342;  SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO = 291; 

Method 2 (low efficiency)

DECLARE CURSOR C1 (E_NO NUMBER) IS SELECT EMP_NAME,SALARY,GRADE FROM EMP WHERE EMP_NO = E_NO; BEGIN OPEN C1(342); FETCH C1 INTO …,…,…; … OPEN C1(291); FETCH C1 INTO …,…,…; … CLOSE C1; END; 

Method 2 (efficient)

SELECT A.EMP_NAME, A.SALARY, A.GRADE, B.EMP_NAME, B.SALARY, B.GRADE FROM EMP A, EMP B WHERE A.EMP_NO = 342 AND B.EMP_NO = 291; 

5. Use the DECODE function to reduce processing time

Use the DECODE function to avoid repeated scan of the same record or join the same table.

For example:

SELECT COUNT(*), SUM(SAL) FROM EMP WHERE DEPT_NO = '0020' AND ENAME LIKE 'SMITH%';  SELECT COUNT(*), SUM(SAL) FROM EMP WHERE DEPT_NO = '0030' AND ENAME LIKE 'SMITH%'; 

You can use the DECODE function to efficiently get the same result.

SELECT COUNT(DECODE(DEPT_NO, '0020', 'X', NULL)) D0020_COUNT, COUNT(DECODE(DEPT_NO, '0030', 'X', NULL)) D0030_COUNT, SUM(DECODE(DEPT_NO, '0020', SAL, NULL)) D0020_SAL, SUM(DECODE(DEPT_NO, 0030, SAL, NULL)) D0030_SAL FROM EMP WHERE ENAME LIKE 'SMITH%'; 

'X' indicates any field
Similarly, the DECODE function can be used in the group by and order by clauses.

6. Replace HAVING clause with Where clause

Avoid using the HAVING clause. HAVING filters the result set only after all records are retrieved. This processing requires sorting, statistics, and other operations.

If the WHERE clause can be used to limit the number of records, this overhead can be reduced.

For example:
Inefficient

Select region, AVG (LOG_SIZE) from location group by region having region! = 'Sydney 'and region! = 'Perth' efficient select region, AVG (LOG_SIZE) from location where region! = 'Sydney 'and region! = 'Perth' GROUP BY REGION

7. Reduce table queries

In SQL statements containing subqueries, pay special attention to reducing the number of queries to the table.
For example:

Inefficient SELECT TAB_NAME from tables where TAB_NAME = (SELECT TAB_NAME FROM TAB_COLUMNS where version = 604) AND DB_VER = (SELECT DB_VER FROM TAB_COLUMNS where version = 604) Efficient SELECT TAB_NAME from tables where (TAB_NAME, DB_VER) = (SELECT TAB_NAME, DB_VER FROM TAB_COLUMNS where version = 604)

Example of updating multiple columns:

Inefficient update emp set EMP_CAT = (select max (CATEGORY) FROM EMP_CATEGORIES), SAL_RANGE = (select max (SAL_RANGE) FROM EMP_CATEGORIES) WHERE EMP_DEPT = 0020; efficient update emp set (EMP_CAT, SAL_RANGE) = (select max (CATEGORY), MAX (SAL_RANGE) FROM EMP_CATEGORIES) WHERE EMP_DEPT = 0020;

8. Use the table Alias (Alias ),When connecting multiple tables in an SQL statement, use the table alias and prefix the alias on each Column, this reduces the parsing time and syntax errors caused by Column ambiguity.

9. Replace IN with EXISTS

In many basic table-based queries, to meet one condition, you often need to join another table.

In this case, using EXISTS (or not exists) usually improves the query efficiency.

Inefficient SELECT * from emp (basic table) where empno> 0 and deptno in (select deptno from dept where loc = 'melb') Efficient SELECT * from emp (basic table) where empno> 0 and exists (SELECT 'x' from dept where dept. DEPTNO = EMP. deptno and loc = 'melb ')

10. replace not in with NOT EXISTS

IN a subquery, the not in Clause executes an internal sorting and merging.

IN either case, not in is the most inefficient because it executes a full table traversal for the table IN the subquery.

To avoid the use of not in, we can rewrite it into an Outer join (Outer Joins) or NOT EXISTS

For example:

SELECT … FROM EMP WHERE DEPT_NO NOT IN (SELECT DEPT_NO FROM DEPT WHERE DEPT_CAT = 'A'); 

To improve efficiency, rewrite

Efficient SELECT... From emp a, dept B where a. DEPT_NO = B. DEPT (+) and B. DEPT_NO is null and B. DEPT_CAT (+) = 'A' most efficient SELECT... From emp e where not exists (SELECT 'x' from dept d where d. DEPT_NO = E. DEPT_NO AND DEPT_CAT = 'A ');

11. Replace EXISTS with table join

Generally, table join is more efficient than EXISTS.

For example:

Select ename from emp e where exists (SELECT 'x' from dept where DEPT_NO = E. DEPT_NO AND DEPT_CAT = 'A'); select ename from dept d, emp e where e. DEPT_NO = D. DEPT_NO AND DEPT_CAT = 'a ';

12. Replace DISTINCT with EXISTS

When you submit a query that contains multi-table information (such as the Department table and employee table), avoid using DISTINCT in the SELECT clause. Generally, you can consider replacing it with EXIST.

For example:

Inefficient select distinct DEPT_NO, DEPT_NAME from dept d, emp e where d. DEPT_NO = E. DEPT_NO efficient SELECT DEPT_NO, DEPT_NAME from dept d where exists (SELECT 'x' from emp e where e. DEPT_NO = D. DEPT_NO );

EXISTS makes the query more rapid, because the core module of RDBMS will return results immediately after the subquery conditions are met.

13. Use indexes to improve efficiency

An index is a conceptual part of a table to improve the efficiency of data retrieval. In fact, SQLSERVER uses a complex self-balancing B-tree Structure

Data Query by index is usually faster than full table scan. When SQLSERVER finds the optimal path for executing the query and Update statements, SQLSERVER optimizer uses the index

Similarly, using indexes to join multiple tables can also improve efficiency. Another advantage of using an index is that it provides uniqueness verification of the primary key.

Except for those LONG or long raw data types, You Can index almost all columns.

Using indexes in large tables is particularly effective. Of course, using indexes to scan small tables also improves efficiency.

Although index can improve the query efficiency, we must pay attention to its cost.

The index requires space for storage and regular maintenance. The index itself is also modified whenever a record is increased or decreased in the table or the index column is modified.

This means that the INSERT, DELETE, and UPDATE operations for each record will pay four or five more disk I/O operations.

Because the index requires additional storage space and processing, the query response time will be slowed down by unnecessary indexes.

SQLSERVER has two access modes for indexes:

1) UNIQUE INDEX SCAN)

In most cases, the optimizer accesses the INDEX through the WHERE clause

For example:
The table LODGING has two indexes: the unique index LODGING_PK on the LODGING column and the non-unique index LODGING on the MANAGER column $ MANAGER
SELECT *
FROM LODGING
Where lodging = 'Rose HILL ';

Internally, the preceding SQL statement is executed in two steps:

First, the LODGING_PK index will be accessed by means of unique index scanning to obtain the corresponding ROWID. Then, the ROWID will be used to access the table and perform the next retrieval.

If the returned columns are included in the INDEX column, SQLSERVER will not perform the second step (access the table through ROWID)

Because the retrieval data is stored in the index, accessing the index alone can fully meet the query results.

2). index range query (index range scan)

Applicable to two situations:

1>. Search for a range based on the unique index

2>. Non-unique index-based retrieval

Example 1
SELECT LODGING
FROM LODGING
Where lodging like'm % ';

The WHERE clause condition includes a series of values. SQLSERVER will query LODGING_PK through index range query.

The index range query returns a set of values, which is less efficient than the unique index scan.

Example 2
SELECT LODGING
FROM LODGING
Where manager = 'bill Gates ';

The SQL statement is executed in two steps. LODGING $ MANAGER queries the index range (obtain the ROWID of all matching records) and obtains the value of the LODGING column through the ROWID access table.

Because LODGING $ MANAGER is a non-unique index, the database cannot perform a unique index scan on it.

In the WHERE clause, if the first character of the value corresponding to the index column starts with the WILDCARD (WILDCARD), the index will not be used
SELECT LODGING
FROM LODGING
Where manager like '% HANMAN ';

In this case, SQLSERVER uses full table Scan

14. Avoid using computation on index Columns

In the WHERE clause, if the index column is part of the function, the optimizer uses full table scan without using the index.

For example:

Inefficient
SELECT...
FROM DEPT
Where sal * 12> 25000;

Efficient
SELECT...
FROM DEPT
Where sal> 25000/12;

Do not process index columns in the search, such as TRIM, TO_DATE, and type conversion. If the index is damaged, full table scan will affect the SQL Execution efficiency.

15. Avoid using is null and is not null in the index column.

To avoid using any columns that can be empty in the index, SQLSERVER will not be able to use the index

This record does not exist in the index if the column contains a null value;

For a composite index, if each column is empty, this record does not exist in the index. If at least one column is not empty, the record is stored in the index.

If the unique index is created in column A and column B of the table, and there is A record in the table, the value of A and B is (123, null ),

SQLSERVER will not accept the insert of the next record with the same A and B values (123, null)

If all index columns are empty, SQLSERVER considers that the entire key value is null, and null cannot be equal to null. Therefore, you can insert 1000 records with the same key value, of course they are all empty!

Because the null value does not exist in the index column, the SQL Server disables the index by comparing the null value of the index column in The WHERE clause.

Inefficient (index failure)
SELECT...
FROM DEPARTMENT
WHERE DEPT_CODE IS NOT NULL

16. Use UNION-ALL and UNION

When the SQL statement needs to UNION two query result sets, these two result sets will be merged in the UNION-ALL method, and then sorted before the final result is output.

If union all is used to replace UNION, this sorting is not necessary and the efficiency will be improved.

It should be noted that union all will repeatedly output the same records in the two result sets, so it is still necessary to analyze the feasibility of using union all from the business needs.

For more information about indexes, see:

1) if you retrieve more than 30% of the records in a table, using indexes will not significantly improve the efficiency.

2 ). in certain circumstances, using indexes may be slower than full table scanning, but this is a gap of the same order of magnitude. In general, using indexes is several or even several thousand times faster than full table scanning!

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.