1. Select the most efficient table name order (valid only in the Rule-based optimizer)
The SQL Server parser processes the table names in the FROM clause in Right-to-left order, therefore, the last table in the FROM clause (driving table) is processed first, and in the case where multiple tables are included in the FROM clause, the table with the fewest number of records must be selected as the underlying table. When SQL Server processes multiple tables, it uses sorting and merging to connect them.
First, scan the first table (the last table in the FROM clause) and sort the records, then scan the second table (the last second table in the FROM clause), and finally merge all the records retrieved from the second table with the appropriate records in the first table.
For example: Table TAB1 16,384 Records table TAB2 5 records, select TAB2 as the base table (best method) Select COUNT (*) from TAB1,TAB2 execution time 0.96 seconds, select TAB2 as the underlying table (poor method) Select Count (*) from TAB2,TAB1 execution time 26.09 seconds;
If you have more than 3 table join queries, you will need to select the Crosstab table (intersection table) as the underlying table, which is the table referenced by the other tables
For example:
The EMP table describes the intersection of the location table and the category table
SELECT *
from LOCATION L,
CATEGORY C,
EMP E
WHERE e.emp_no BETWEEN 1000 and
e.cat_no = C.cat_ NO
will be more efficient than the following SQL
SELECT *
from EMP E,
LOCATION L,
CATEGORY C
WHERE e.cat_no = c.cat_no and
e.locn = L.LOCN
Connection order in the 2.WHERE clause
SQL Server parses the WHERE clause in a bottom-up order, according to which the connection between the tables must be written before the other where conditions, and the conditions that can filter out the maximum number of records must be written at the end of the WHERE clause
For example:
(Inefficient, execution time 156.3 seconds)
SELECT * FROM
emp E
WHERE SAL > 50000 and
JOB = ' MANAGER '
< (select COUNT (*) from EMP wher E mgr=e.empno);
(Efficient, execution time 10.6 seconds)
SELECT * FROM
emp E
where < (select COUNT (*) from EMP where mgr=e.empno) and
SAL > 50000
avoid the use of ' * ' in the 3.SELECT clause. When you want to list all columns in the SELECT clause, it is convenient to use a dynamic SQL column reference ' * ', which, unfortunately, is a very inefficient approach. In fact, SQL Server, in the process of parsing, converts ' * ' to all column names in turn, which is done by querying the data dictionary, which means it will take more time
4. Reduce the number of accesses to the database. when executing every SQL statement, SQL Server does a lot of work internally: Parsing SQL statements, estimating index utilization, binding variables, reading blocks, and so on
Thus, reducing the number of accesses to the database can actually reduce the workload of SQL Server, for example:
Here are three ways to retrieve employees with an employee number equal to 0342 or 0291
Method 1 (lowest effect)
SELECT Emp_name, SALARY, GRADE from
EMP
WHERE emp_no = 342;
SELECT Emp_name, SALARY, GRADE from
EMP
Method 2 (secondary low efficiency)
DECLARE
CURSOR C1 (e_no number) is
SELECT emp_name,salary,grade from
EMP
WHERE emp_no = e_no;
BEGIN
OPEN C1 (342);
FETCH C1 into ...,...,...;
... OPEN C1 (291);
FETCH C1 into ...,...,...;
... Close C1;
Method 2 (efficient)
SELECT A.emp_name, A.salary, A.grade,
b.emp_name, B.salary, b.grade from
emp A, emp B
WHERE a.emp_no = 342
5. Use the Decode function to reduce processing time
Use the Decode function to avoid repeated scans of the same record or repeated connections to the same table
For example:
SELECT COUNT (*), SUM (SAL) from the
EMP
WHERE dept_no = ' 0020 ' and
ename like ' smith% ';
SELECT COUNT (*), SUM (SAL) from
EMP
WHERE dept_no = ' 0030 '
You can use the Decode function to get the same results efficiently.
SELECT count (DECODE (dept_no, ' 0020 ', ' x ', null)) D0020_count,
count (DECODE (dept_no, ' 0030 ', ' x ', null)) d0030_ COUNT,
sum (DECODE (dept_no, ' 0020 ', Sal, null)) D0020_sal,
sum (DECODE (dept_no, 0030, Sal, null)) D0030_sal From
EMP
' X ' means any one field
Similarly, the Decode function can also be applied to the group BY and ORDER BY clauses
6. Replace the HAVING clause with the WHERE clause
Avoid having a HAVING clause that will filter the result set only after retrieving all records, which requires sorting, statistics, and so on
If you can limit the number of records through the WHERE clause, you can reduce the overhead
For example:
Low efficiency
SELECT REGION, AVG (log_size) from
LOCATION
GROUP by REGION have
REGION REGION '!= ' and
SYDNEY ' REGION ' PERTH '
efficient
SELECT REGION, AVG (log_size) from
LOCATION
WHERE REGION REGION!= ' SYDNEY ' and
REGION!= ' PERTH '
7. Reduce the query to the table
In SQL statements that contain subqueries, pay special attention to reducing the query to the table
For example:
Low
-Efficiency select Tab_name from
TABLES
where tab_name = (select Tab_name from
tab_columns
where VERSION = 604) and
db_ver = (select Db_ver from
tab_columns
WHERE VERSION = 604)
Efficient
SELECT tab_n AME from
the TABLES WHERE (tab_name, db_ver) = (SELECT tab_name, db_ver from
tab_columns
Update multiple column examples:
Low-efficiency
UPDATE EMP
SET emp_cat = (select Max (CATEGORY)
from emp_categories),
sal_range = (select Max (Sal_ RANGE) from
emp_categories)
WHERE emp_dept = 0020;
Efficient
UPDATE EMP
SET (emp_cat, Sal_range) = (SELECT max (CATEGORY), Max (Sal_range) from
emp_categories)
8. Using the alias of the table (alias), when connecting multiple tables in an SQL statement, use the alias of the table and prefix the alias with each column, which reduces the time to parse and reduces the syntax errors caused by column ambiguity
9. Replace in with exists
In many base-table queries, it is often necessary to join another table in order to satisfy one condition
In this case, using EXISTS (or not EXISTS) usually increases the efficiency of the query
Low-Efficiency
SELECT * FROM
EMP (base table)
WHERE EMPNO > 0 and
DEPTNO in (select DEPTNO from
DEPT
where L OC = ' Melb ')
efficient
select * from
EMP (base table)
WHERE EMPNO > 0 and
EXISTS (select ' X ' from
dept< C27/>where DEPT. DEPTNO = EMP. DEPTNO
10. Use NOT exists instead of in
In a subquery, the NOT IN clause performs an internal sort and merge
In either case, not in is the least effective because it performs a full table traversal of the table in the subquery
To avoid using not in, we can rewrite it as an outer join (Outer joins) or not EXISTS
For example:
SELECT ...
From the EMP
WHERE dept_no not in (SELECT dept_no from
DEPT
To improve efficiency rewrite to
High-efficiency
SELECT ...
From EMP A, DEPT B
WHERE a.dept_no = b.dept (+) and
B.dept_no is NULL and
b.dept_cat (+) = ' A '
most efficient
S
elect ... From the EMP E
where not EXISTS (SELECT ' X ' from
DEPT D
where d.dept_no = E.dept_no
11. Replace exists with table connection
In general, table joins are more efficient than exists
For example:
Select ename from
EMP E
where EXISTS (select ' X ' from
DEPT
where dept_no = E.dept_no and
dept_cat = ' A ');
More efficient
SELECT ename
from DEPT D, EMP E
WHERE e.dept_no = d.dept_no
12. Replace distinct with exists
When submitting a query that contains multiple table information, such as a departmental table and an employee table, avoid using distinct in the SELECT clause, you can generally consider replacing the exist with the
For example:
Low
-Efficiency Select DISTINCT dept_no, dept_name from
DEPT D, EMP E
WHERE d.dept_no = e.dept_no
efficient
Select D Ept_no, dept_name from
DEPT D
WHERE EXISTS (SELECT ' X ' from
EMP E
exists makes queries faster because the RDBMS core module returns the result immediately after the condition of the subquery is satisfied
13. Use Index to improve efficiency
An index is a conceptual part of a table that is used to improve the efficiency of retrieving data. In fact, SQL Server uses a complex, self-balanced b-tree structure
In general, querying data through an index is faster than full table scans. When SQL Server finds the best path to execute a query and UPDATE statement, the SQL Server optimizer uses the index
Also, using indexes when connecting multiple tables can improve efficiency. Another advantage of using the index is that it provides uniqueness validation for the primary key (primary key)
In addition to those long or long raw data types, you can index almost all of the columns
It is usually especially useful to use indexes in large tables, and, of course, using indexes can also improve efficiency when scanning small tables
Although the use of indexes can improve query efficiency, we must also pay attention to its cost
Indexes require space for storage and regular maintenance, and the index itself is modified whenever a record is added or subtracted from the table or the index column is modified
This means that the insert, DELETE, and update for each record will pay 4 or 5 more disk I/O
Because indexes require additional storage space and processing, unnecessary indexing slows query response times
SQL Server has two modes of access to indexes:
1). Index unique Scan (indexed unique SCAN)
In most cases, the optimizer accesses the index by using the WHERE clause
For example:
Table Lodging has two indexes: a unique index based on the lodging column LODGING_PK and a non-unique index established on the MANAGER column Lodging$manager
SELECT *
From lodging
WHERE lodging = ' ROSE HILL ';
Internally, the above SQL is divided into two steps:
First, the LODGING_PK index is accessed by means of an index-only scan, obtaining the corresponding ROWID, and then performing the next retrieval by ROWID access to the table
If the column returned by the retrieval is included in the index column, SQL Server does not perform the second step (access to the table via ROWID)
Because retrieving data is stored in an index, access to the index alone can fully satisfy the query results
2. Indexed range Query (index range SCAN)
Applies to two situations:
1> retrieval of a range based on uniqueness index
2> retrieval based on a non-unique index
Example 1
SELECT Lodging
From lodging
WHERE lodging like ' m% ';
The WHERE clause condition includes a series of values that SQL Server will query through the index range query lodging_pk
Because the index range query returns a set of values, its efficiency requires that the peso be the only scan lower
Example 2
SELECT Lodging
From lodging
WHERE MANAGER = ' BILL GATES ';
This SQL executes in two steps, Lodging$manager the index range query (gets all the rowid that match the condition Records), and gets the value of the lodging column through the ROWID Access table
Because Lodging$manager is a non unique index, the database cannot perform an index-only scan on it
In the WHERE clause, the index will not be used if the first character Fu Yutung (wildcard) of the value corresponding to the indexed column begins
SELECT Lodging
From lodging
WHERE MANAGER like '%hanman ';
In this case, SQL Server will use a full table scan
14. Avoid using calculations on indexed columns
In the WHERE clause, if the index column is part of a function, the optimizer will use a full table scan without indexing
For example:
Low efficiency
SELECT ...
From DEPT
WHERE SAL * > 25000;
Efficient
SELECT ...
From DEPT
WHERE SAL > 25000/12;
It is important to note that indexing columns are not processed in the search, such as trim,to_date, type conversions, etc., destroying the index and using full table scans to affect SQL execution efficiency
15. Avoid using is null and are NOT NULL on indexed columns
Avoid using any nullable columns in the index, SQL Server will not be able to use the index
For Single-column indexes, this record will not exist in the index if the column contains null values;
For composite indexes, if each column is empty, the record also does not exist in the index. If at least one column is not empty, the record exists in the index
If the uniqueness index is based on the columns A and B of the table, and the a,b value of a record exists in the table (123,null),
SQL Server will not accept the next record insertion with the same a,b value (123,null)
If all the indexed columns are empty, SQL Server will think that the entire key value is empty, and null cannot equal NULL, so you can insert 1000 records with the same key value, of course they are empty!
Because null values do not exist in the indexed column, NULL comparisons of indexed columns in the WHERE clause will cause SQL Server to deactivate the index
Inefficient (index failure)
SELECT ...
From DEPARTMENT
WHERE Dept_code is not NULL
16. Using Union-all and Union
When the SQL statement requires a union of two query result sets, the two result sets are merged in a union-all manner, and then sorted before outputting the final result
If you use UNION ALL instead of union, this sort of order is not necessary and the efficiency will be improved.
It should be noted that union all will repeat the same record in the two result sets, so the feasibility of using union all from the business requirements analysis
Refer to the following experiences for indexing:
1. If you retrieve the number of records in a table with more than 30% data, there is no significant efficiency improvement in using the index
2. In a particular case, using an index may be slower than a full table scan, but this is the same order of magnitude, and typically, using an index is a few times or even thousands of times times faster than a full table scan!