SQL statement Optimization

Last Update:2018-12-07 Source: Internet

Author: User

Tags sorts sql server example

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

SQL statement optimization principles:
1. In Operator
SQL statements written in are easy to write and understand, which is suitable for modern software development.
However, the SQL Performance with in is always relatively low. The difference between SQL with in and SQL without in is analyzed from the execution steps below:
Converts a table to a join of multiple tables. If the conversion fails, execute the subquery in the in statement before querying the outer table records, if the conversion is successful, multiple tables are directly connected for query. It can be seen that at least one conversion process is added to SQL statements using in. General SQL statements can be converted successfully, but SQL statements that contain grouping statistics cannot be converted.
Recommended Solution: avoid using the in operator in business-intensive SQL statements. It can be replaced by exists.
SQL Server example:
Exists usage:
Select * From kj_dept where exists (select * From kj_dept_info where kj_dept.dept_id = dept_id and dept_id = xxx)
In usage:
Select * From kj_dept where dept_id in (select dept_id from kj_dept_info where dept_id = xxx)

2. Not in Operator
This operation is not recommended for strong columns because it cannot apply table indexes.
Recommended Solution: use not exists or (the outer join + is null) instead.

3. <> operator (not equal)
The non-equals operator will never use the index, so the processing of it will only generate a full table scan.
Recommended Solution: use operations with the same functions, such
A <> 0 to a> 0 or a <0
A <> ''To A>''

4. Is null or is not null operation (judge whether the field is empty)
Generally, indexes are not used to determine whether a field is null, because the B-tree index does not have a null index.
Recommended Solution: use operations with the same functions, such
Change a is not null to a> 0 or a>.
Fields are not allowed to be empty, but a default value is used to replace null values. For example, status fields in the application for expansion cannot be empty. The default value is application.

5.> and <operator (greater than or less than operator)
If the value is greater than or less than the operator, you do not need to adjust it. Because it has an index, index search is used, but in some cases it can be optimized. For example, a table has 1 million records, for a numeric field a, 0.3 million records a = 3. Therefore, the effect of executing a> 2 and a> = 3 is very different, because Oracle will first find the record index of 2 and then compare it, when a> = 3, Oracle directly finds the record Index = 3.

6. Like Operator
The like operator can be used for wildcard queries. The wildcard combinations in the like operator can be used for almost any queries. However, poor use may result in performance problems, for example, like '% 100' does not reference the index, while like 'x5400%' references the range index. An actual example: the user ID following the Business ID in the yw_yhjbqk table can be used to query the Business ID yy_bh like '% 100'. This condition will generate a full table scan, if you change to yy_bh like 'x5400% 'or yy_bh like 'b5400%', the index of yy_bh will be used to query the two ranges, and the performance will be greatly improved.

7. Union operator
Union filters out duplicate records after table link. Therefore, after table link, it sorts the generated result sets and deletes duplicate records before returning results. In most applications, duplicate records are not generated. The most common is the union of Process Tables and historical tables. For example:
Select * From gc_dfys
Union
Select * From ls_jg_dfys
This SQL statement extracts the results of two tables at run time, sorts and deletes duplicate records using the sorting space, and finally returns the result set. If the table has a large amount of data, it may cause disk sorting.

Recommended Solution: Use the Union all operator to replace union, because the Union all operation simply merges the two results and returns them.
Select * From gc_dfys
Union all
Select * From ls_jg_dfys

8. Effect of conditional order after where
The conditional order following the WHERE clause directly affects the query of the big data table, for example:
Select * From zl_yhjbqk where dy_dj = '1k' and xh_bz = 1
Select * From zl_yhjbqk where xh_bz = 1 and dy_dj = '1kv below'
In the preceding two SQL statements, the dy_dj and xh_bz fields are not indexed. Therefore, full table scan is performed; in the first SQL statement, the dy_dj = '1kv below 'condition is 99% in the record set, while the xh_bz = 1 condition is only 0.5%, when the first SQL statement is executed, 99% records are compared with dy_dj and xh_bz. When the second SQL statement is executed, 0.5% records are compared with dy_dj and xh_bz, the CPU usage of the second SQL statement is obviously lower than that of the first SQL statement.
The join between tables must be written before other where conditions, and the conditions that can filter out the maximum number of records must be written at the end of the WHERE clause. [the more effective the condition for narrowing the range, the more backward]

9. query the influence of table order
The list order in the table after from will affect the SQL Execution performance. If there is no index, it will be linked according to the order in which the table appears. This is because the table order is incorrect, this will generate a crossover of data that consumes a lot of server resources.

10. Join column
for joined columns, the optimizer does not use an index even if the last join value is a static value. Let's take a look at an example. Suppose there is a employee table (employee). For a employee's surname and name are divided into two columns for storage (first_name and last_name), we want to query a table named Bill. bill Cliton employees.
the following is an SQL statement using join queries:
select * From employss
where first_name = ''and last_name = 'beill cliton ';
the preceding statement can be used to check whether the employee Bill Cliton exists. However, note that the index created by last_name is not used. When the following SQL statement is used, the system can use an index created based on last_name.
select * from employee where first_name = 'beill' and last_name = 'cliton';
what should I do in the following situations? If a variable (name) stores the name of the employee Bill Cliton, how can we avoid full traversal and use indexes in this case? You can use a function to separate the surname and name in the variable name. However, note that this function cannot be used in the index column. The following is an SQL query Script:
select * from employee
where
first_name = substr ('& name', 1, instr (' & name ', '')-1)
and
last_name = substr ('& name', instr (' & name','') + 1)

11. Order by statement
The order by statement determines how the returned query results are sorted. The order by statement has no special restrictions on the columns to be sorted. You can also add functions to the columns (such as joining or appending ). Any non-index item or computed expression in the order by statement will reduce the query speed.
Check the order by statement carefully to find out non-index items or expressions, which will reduce performance. To solve this problem, rewrite the order by statement to use the index. You can also create another index for the column you are using. Avoid using an expression in the order by clause.
You can use the WHERE clause to replace order.

12. Avoid '*'
Avoid using '*' in the select clause. When you want to list all columns in the select clause, using dynamic SQL column reference '*' is a convenient method. unfortunately, this is a very inefficient method. in fact, during the parsing process, '*' is converted into all column names in sequence. This task is done by querying the data dictionary, which means it takes more time.

13. Replace exists with table join
Generally, table join is more efficient than exists.

14. Replace distinct with exists
When you submit a query that contains one-to-many table information (such as the Department table and employee table), avoid using distinct in the select clause. You can generally consider replacing it with exist.

15. Avoid improper index design
Reasonable index design allows every SQL statement to use indexes as much as possible. The in or clause usually uses worksheets to invalidate the index. If a large number of duplicate values are not generated, consider splitting the clause. The split clause should contain the index.
Avoid using not in the index column.

16. Improve SQL efficiency through internal functions
Complex SQL statements tend to sacrifice execution efficiency. It makes sense to understand how internal functions are used to solve problems. If possible, you can use stored procedures for implementation!
However, do not use functions in the index column.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More