SQL statement optimization to improve database performance

Source: Internet
Author: User
Tags mysql query optimization website performance

In systems with unsatisfactory performance, the cause is sometimes that the application load exceeds the server's actual processing capacity, but more often it is that the system contains a large number of SQL statements that need to be optimized. To achieve stable execution performance, the simpler the SQL statement, the better, so complicated SQL statements should be simplified.

Common rules for simplification are as follows:

1) Do not JOIN more than five tables.
2) Consider using temporary tables or table variables to store intermediate results.
3) Use fewer subqueries.
4) Do not nest views too deeply. In general, view nesting should not exceed two levels.

I. Statement of the problem

In the early stages of application development, the development database holds relatively little data, so the performance difference between well-written and poorly written queries and complex views is hard to notice. After the application goes into production and the data in the database grows, system response time becomes one of the most important problems to solve. An important aspect of system optimization is the optimization of SQL statements. For massive data, the speed difference between poor SQL statements and high-quality SQL statements can reach a factor of hundreds. A system therefore cannot simply implement its functions; it also needs high-quality SQL statements to improve system availability.

In most cases, Oracle uses indexes to traverse tables faster, and the optimizer improves performance mainly based on the defined indexes. However, if the code in the WHERE clause of an SQL statement is written unreasonably, the optimizer ignores the index and uses a full table scan. Such statements are the so-called poor SQL statements. When writing SQL statements, we should be clear about the principles by which the optimizer discards indexes, which helps in writing high-performance SQL.

II. Notes on writing SQL statements

The following describes issues to watch for when writing the WHERE clause of an SQL statement. In these WHERE clauses, even if some columns are indexed, the system cannot use the indexes when running the statement because the SQL is poorly written. A full table scan is used instead, which greatly reduces response speed.

1. Operator Optimization

(A) IN Operator

SQL statements written with IN are easy to write and easy to understand, which suits the style of modern software development. However, SQL statements using IN often perform poorly. Oracle executes an SQL statement with IN differently from one without IN in the following way:

Oracle tries to convert the statement into a join of multiple tables. If the conversion fails, it first executes the subquery inside IN and then queries the records of the outer table. If the conversion succeeds, it queries using a direct join of the tables. So an SQL statement using IN involves at least one extra conversion step. Ordinary SQL statements can be converted successfully, but statements that contain grouping or aggregation cannot.

Recommended solution: in business-intensive SQL statements, use EXISTS instead of the IN operator.

(B) NOT IN Operator

This operator is strongly discouraged because it cannot use the table's indexes.

Recommended solution: use NOT EXISTS instead.

(C) IS NULL and IS NOT NULL (testing whether a field is empty)

Testing whether a field is null generally cannot use an index, because a B-tree index does not index null values. A row whose indexed columns are all null has no entry in the index, so the index cannot answer which rows are null. Any statement that uses IS NULL or IS NOT NULL in the WHERE clause therefore prevents the optimizer from using the index.

Recommended solution: replace the test with another operation that has the same effect, such as changing a IS NOT NULL to a > 0 or a similar comparison. Alternatively, do not allow the field to be empty and use a default value in place of null. For example, the status field of an application record is not allowed to be empty and defaults to "applied".

(D) > and < operators (greater than and less than)

Greater-than and less-than conditions generally need no adjustment: when the column is indexed, an index search is used. In some cases, however, they can still be optimized. For example, suppose a table has 1 million records with a numeric field A, and 300,000 of the records have A = 3. Executing A > 2 and executing A >= 3 then behave very differently: with A > 2, Oracle first finds the index entries for 2 and then compares onward, whereas with A >= 3, Oracle goes directly to the index entries equal to 3.

(E) LIKE Operator

The LIKE operator supports wildcard queries, and its wildcard combinations can express almost any query. Used poorly, however, it causes performance problems. For example, LIKE '%5400%' does not use the index, while LIKE 'x5400%' uses a range scan on the index.

A real example: in the YW_YHJBQK table, the user identifier follows the business number, so the business number can be queried with YY_BH LIKE '%5400%'. This condition produces a full table scan. If it is changed to YY_BH LIKE 'x5400%' OR YY_BH LIKE 'b5400%', the index on YY_BH is used to query the two ranges, and performance improves greatly.

LIKE statements with a leading wildcard (%):

The example above shows this situation. Suppose you need to query the employee table for people whose last name contains cliton. The following SQL statement can be used:

select * from employee where last_name like '%cliton%';

Because the wildcard (%) appears at the beginning of the search term, Oracle does not use the index on last_name. This often cannot be avoided, but it should be well understood: a leading wildcard slows the query down. When the wildcard appears elsewhere in the string, the optimizer can use the index. The following query uses the index:

select * from employee where last_name like 'c%';

(F) UNION operator

UNION filters out duplicate records after combining the tables. It therefore sorts the combined result set and removes duplicate records before returning the result. In most applications no duplicate records are produced. The most common case is a UNION of a process table and a history table. For example:

select * from gc_dfys union select * from ls_jg_dfys

At run time this statement first fetches the results of the two tables, then sorts them in the sort area to remove duplicate records, and finally returns the result set. If the tables hold a large amount of data, the sort may spill to disk.

Recommended solution: use UNION ALL instead of UNION, because UNION ALL simply merges the two results and returns them.

select * from gc_dfys union all select * from ls_jg_dfys
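
The difference in results between the two operators can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Oracle, so it shows only what each operator returns, not Oracle's sort cost. The column layout (id, amount) and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gc_dfys    (id INTEGER, amount INTEGER);  -- process table (assumed columns)
    CREATE TABLE ls_jg_dfys (id INTEGER, amount INTEGER);  -- history table (assumed columns)
    INSERT INTO gc_dfys    VALUES (1, 100), (2, 200);
    INSERT INTO ls_jg_dfys VALUES (2, 200), (3, 300);      -- (2, 200) exists in both tables
""")

# UNION removes the duplicate row; UNION ALL returns every row from both tables.
union_rows = conn.execute(
    "SELECT * FROM gc_dfys UNION SELECT * FROM ls_jg_dfys").fetchall()
union_all_rows = conn.execute(
    "SELECT * FROM gc_dfys UNION ALL SELECT * FROM ls_jg_dfys").fetchall()

print(len(union_rows), len(union_all_rows))
```

When the two tables cannot contain the same row, both statements return the same set of rows, which is the case in which UNION ALL is a safe replacement.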

(G) Concatenated columns

For concatenated columns, the optimizer does not use the index even when the value compared against is a constant. Consider an example. Suppose there is an employee table (employee) in which an employee's first name and last name are stored in two columns (FIRST_NAME and LAST_NAME), and we want to find the employee named Bill Cliton.

The following SQL statement uses a concatenated-column condition:

select * from employee where first_name||' '||last_name ='Bill Cliton';

This statement does find out whether the employee Bill Cliton exists, but the optimizer does not use the index created on last_name. When the condition is written as follows, Oracle can use the index created on last_name:

where first_name ='Bill' and last_name ='Cliton';
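
The same effect can be observed in a small, hedged sketch. It runs on SQLite through Python's sqlite3 module rather than on Oracle, so the plan text is SQLite's own, but the behaviour is analogous: the concatenated condition scans the table, while the split condition searches the index. The index name idx_employee_last_name, the id column, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE INDEX idx_employee_last_name ON employee (last_name);  -- assumed index name
    INSERT INTO employee (first_name, last_name) VALUES
        ('Bill', 'Cliton'), ('Ann', 'Cliton'), ('Bill', 'Smith');
""")

def plan(sql):
    """Join the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

concat_sql = "SELECT * FROM employee WHERE first_name || ' ' || last_name = 'Bill Cliton'"
split_sql = "SELECT * FROM employee WHERE first_name = 'Bill' AND last_name = 'Cliton'"

# Both forms return the same row; only the access path differs.
concat_rows = conn.execute(concat_sql).fetchall()
split_rows = conn.execute(split_sql).fetchall()
concat_plan = plan(concat_sql)
split_plan = plan(split_sql)

print(concat_plan)
print(split_plan)
```

On Oracle the equivalent check would be done with its own execution plan tools; the sketch only illustrates why the split form is preferable.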

(H) ORDER BY statement

The ORDER BY statement determines how Oracle sorts the returned query results. It places no special restrictions on the columns to be sorted, and functions (such as concatenation or addition) can be applied to them. However, any non-indexed item or computed expression in the ORDER BY statement slows the query down.

Check ORDER BY statements carefully for non-indexed items or expressions, since they reduce performance. To solve the problem, rewrite the ORDER BY statement to use an index, or create another index on the columns being used. Avoid using expressions in the ORDER BY clause.

(I) NOT

In queries we often use logical expressions in the WHERE clause, such as greater than, less than, equal to, and not equal to, as well as AND, OR, and NOT. NOT can be used to negate any logical operator. The following is an example of a NOT clause:

where not (status ='VALID')

To use NOT, put the expression to be negated in parentheses and put the NOT operator before it. NOT is also contained in another logical operator, the not-equal (<>) operator. In other words, even if the word NOT is not written explicitly in the WHERE clause, NOT is still present in the operator. See the following example:

where status <>'INVALID';

A query of this kind, for example one with the condition salary <> 3000, can be rewritten without NOT:

select * from employee where salary<3000 or salary>3000;

Although the two queries return the same results, the second form is faster than the first. The second query allows Oracle to use the index on the salary column, while the first does not.
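
That the two forms return the same rows can be confirmed with a short sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle and checks only result equivalence, not index usage. The name column and the sample rows, including one row with a null salary, are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('Ann', 2500), ('Bob', 3000), ('Cid', 3500), ('Dee', NULL);
""")

# Not-equal form and its rewrite as two ranges joined by OR.
not_equal = sorted(conn.execute(
    "SELECT name FROM employee WHERE salary <> 3000"))
range_or = sorted(conn.execute(
    "SELECT name FROM employee WHERE salary < 3000 OR salary > 3000"))

print(not_equal)
print(range_or)
```

The row with a null salary is excluded by both forms, so the rewrite does not change how nulls are treated.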

2. Influence of how SQL is written

(A) Impact of different SQL text for the same function

Suppose the same SQL statement is written by four programmers.

Programmer A writes: Select * from zl_yhjbqk

Programmer B writes: Select * from dlyx.zl_yhjbqk (with the table owner prefix)

Programmer C writes: Select * from DLYX.ZL_YHJBQK (upper-case table name)

Programmer D writes: Select *  from DLYX.ZL_YHJBQK (extra spaces in the middle)

After Oracle parses them, the four statements produce the same result and take the same time to execute. However, from the way Oracle's shared memory (SGA) works, it follows that Oracle parses each distinct SQL text once and stores it in shared memory. If the SQL strings and their formatting are exactly identical, Oracle parses the statement only once and keeps only one parse result in shared memory. This not only reduces SQL parsing time but also reduces duplicate information in shared memory, and it lets Oracle count the execution frequency of the SQL accurately.

(B) Effect of the order of conditions after WHERE

The order of the conditions in the WHERE clause directly affects queries on large tables. For example:

Select * from zl_yhjbqk where dy_dj = '1k' and xh_bz = 1
Select * from zl_yhjbqk where xh_bz = 1 and dy_dj = '1k'

In the two statements above, neither the dy_dj field nor the xh_bz field is indexed, so both perform a full table scan. The dy_dj condition (below 1 kV) matches 99% of the records, while the xh_bz = 1 condition matches only 0.5%. When the first statement is executed, 99% of the records are compared against both dy_dj and xh_bz. When the second statement is executed, only 0.5% of the records are compared against both. The second statement therefore uses noticeably less CPU than the first.

(C) Influence of the order of the queried tables

The order of the tables listed after FROM affects SQL execution performance. When there are no indexes and Oracle has not gathered statistics on the tables, Oracle joins them in the order in which they appear, so a wrong table order produces an operation that consumes a great deal of server resources. (Note: if statistics have been gathered on the tables, Oracle automatically joins starting from the small table and then the large table.)

3. Index utilization in SQL statements

(A) Optimization of condition fields

Fields processed by a function cannot use an index. For example:

substr(hbs_bh, 1, 4) = '5400', optimized: hbs_bh like '5400%'
trunc(sk_rq) = trunc(sysdate), optimized: sk_rq >= trunc(sysdate) and sk_rq < trunc(sysdate + 1)

Fields that undergo explicit or implicit operations cannot use an index. For example:

ss_df + 20 > 50, optimized: ss_df > 30
'X' || hbs_bh > 'X5400021452', optimized: hbs_bh > '5400021452'
sk_rq + 5 = sysdate, optimized: sk_rq = sysdate - 5
hbs_bh = 5401002554, optimized: hbs_bh = '5401002554'. Note: this condition implicitly applies to_number to hbs_bh, because the hbs_bh field is a character type.

Conditions that involve fields from multiple tables cannot use an index. For example:

ys_df > cx_df, cannot be optimized
qc_bh || kh_bh = '5400250000', optimized: qc_bh = '5400' and kh_bh = '250000'
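
Two of the rewrites above (the date truncation and the arithmetic on a column) can be tried out in a rough sketch. It runs on SQLite through Python's sqlite3 module, not on Oracle, so SQLite's date() stands in for trunc() and a fixed date stands in for sysdate. The table name payments, the index names, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id INTEGER PRIMARY KEY, sk_rq TEXT, ss_df INTEGER, note TEXT);
    CREATE INDEX idx_sk_rq ON payments (sk_rq);
    CREATE INDEX idx_ss_df ON payments (ss_df);
    INSERT INTO payments (sk_rq, ss_df) VALUES
        ('2024-01-14 23:59:00', 10),
        ('2024-01-15 08:30:00', 40),
        ('2024-01-15 17:45:00', 25),
        ('2024-01-16 00:00:00', 60);
""")

def plan(sql):
    """Join the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

def ids(sql):
    """Return the sorted primary keys selected by a query."""
    return sorted(row[0] for row in conn.execute(sql))

# Each pair is (condition on a computed column, equivalent condition on the bare column).
rewrites = [
    ("SELECT * FROM payments WHERE date(sk_rq) = '2024-01-15'",
     "SELECT * FROM payments WHERE sk_rq >= '2024-01-15' AND sk_rq < '2024-01-16'"),
    ("SELECT * FROM payments WHERE ss_df + 20 > 50",
     "SELECT * FROM payments WHERE ss_df > 30"),
]

for original, optimized in rewrites:
    print(ids(original) == ids(optimized), "|", plan(original), "|", plan(optimized))
```

In each pair the rows returned are identical, and only the form that leaves the column bare lets the planner search the index.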

4. More tips on SQL optimization

(1) Choose the most efficient order of table names (only effective with the rule-based optimizer):

The Oracle parser processes the table names in the FROM clause from right to left, so the table written last in the FROM clause (the base table, or driving table) is processed first. When the FROM clause contains multiple tables, choose the table with the fewest records as the base table. If more than three tables are joined in the query, choose the intersection table as the base table, that is, the table that is referenced by the other tables.

(2) Order of conditions in the WHERE clause:

Oracle parses the WHERE clause from the bottom up. According to this principle, joins between tables should be written before the other WHERE conditions, and the conditions that filter out the largest number of records should be written at the end of the WHERE clause.

(3) Avoid using '*' in the SELECT clause:

During parsing, Oracle expands '*' into all the column names in order. It does this by querying the data dictionary, which takes additional time.

(4) Reduce the number of database accesses:

Oracle performs a lot of internal work for every access: parsing the SQL statement, estimating index utilization, binding variables, reading data blocks, and so on.

(5) Reset the ARRAYSIZE parameter in SQL*Plus, SQL*Forms, and Pro*C to increase the amount of data retrieved per database access. The recommended value is 200.

(6) Use the DECODE function to reduce processing time:

The DECODE function can avoid repeatedly scanning the same records or repeatedly joining the same table.

(7) Combine simple, unrelated database accesses:

If you have several simple database queries, you can combine them into a single query (even if there is no relationship between them).

(8) Delete duplicate records:

The most efficient way to delete duplicate records (because it uses ROWID) is as follows:

DELETE FROM emp e WHERE e.ROWID > (SELECT MIN(x.ROWID) FROM emp x WHERE x.emp_no = e.emp_no);
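
The pattern can be tried on a toy table with the sketch below. It uses SQLite through Python's sqlite3 module, where the implicit rowid column plays the role of Oracle's ROWID, so it shows the logic of the statement rather than Oracle's performance. The name column and the sample rows are assumptions, and the outer table is referenced without an alias to keep the statement portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_no INTEGER, name TEXT);
    INSERT INTO emp VALUES
        (1, 'a'), (1, 'a'), (2, 'b'), (3, 'c'), (3, 'c'), (3, 'c');
""")

# Keep the row with the smallest rowid for each emp_no and delete the rest.
deleted = conn.execute("""
    DELETE FROM emp
    WHERE rowid > (SELECT MIN(x.rowid) FROM emp x WHERE x.emp_no = emp.emp_no)
""").rowcount

remaining = conn.execute(
    "SELECT emp_no, name FROM emp ORDER BY emp_no").fetchall()
print(deleted, remaining)
```

One row per emp_no survives, namely the one that was inserted first.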

(9) Replace DELETE with TRUNCATE:

When records are deleted from a table, a rollback segment is normally used to store information that allows recovery. If you have not committed the transaction, Oracle can restore the data to its state before the deletion (more precisely, to the state before the DELETE command was executed). When TRUNCATE is used, the rollback segment stores no recovery information, and once the command has run the data cannot be restored. It therefore uses few resources and executes quickly. (Translator's note: TRUNCATE applies only to deleting all rows of a table, and TRUNCATE is DDL rather than DML.)

(10) Use COMMIT as much as possible:

Whenever possible, use COMMIT frequently in the program. Performance improves and resource demand falls because of the resources that COMMIT releases, which are:

A. Information on the rollback segment used to restore data.
B. Locks acquired by program statements.
C. Space in the redo log buffer.
D. The internal overhead Oracle spends managing the three resources above.

(11) Replace the HAVING clause with the WHERE clause:

Avoid the HAVING clause where possible. HAVING filters the result set only after all records have been retrieved, and this processing requires sorting, totaling, and other operations. If the WHERE clause can limit the number of records, this overhead is reduced.

(Outside Oracle) three clauses can carry conditions: ON, WHERE, and HAVING. ON is executed first, WHERE second, and HAVING last. Because ON filters out the records that do not meet the condition before any aggregation is done, it reduces the data that the intermediate operations must process and should in principle be the fastest. WHERE should also be faster than HAVING, because it filters the data before the SUM is performed. ON is used only when two tables are joined, so for a single table only WHERE and HAVING remain to be compared.

For a single-table aggregate query, if the filter condition does not involve a computed field, the two give the same result, but WHERE can be optimized in ways that HAVING cannot, so the latter is slower. If the condition involves a computed field, the value of that field is not known before the computation. Following the processing order described above, WHERE takes effect before the computation and HAVING only after it, so in this case the two give different results.

In multi-table join queries, ON takes effect earlier than WHERE. The system first combines the tables into a temporary table according to the join conditions between them, then filters with WHERE, then computes, and finally filters the computed result with HAVING. For a filter condition to work correctly, you must first understand when the condition should take effect and then decide where to put it.

(12) Reduce table queries:

In SQL statements containing subqueries, pay special attention to reducing the number of queries to the table. Example:


(13) Improve SQL efficiency through built-in functions:

Complex SQL often sacrifices execution efficiency. Knowing how to apply functions to solve such problems is very valuable in practice.

(14) Use table aliases:

When joining multiple tables in an SQL statement, use table aliases and prefix each column with its alias. This reduces parsing time and avoids syntax errors caused by ambiguous column names.

(15) Replace IN with EXISTS and NOT IN with NOT EXISTS:

In many queries based on a base table, satisfying a condition often requires joining another table. In such cases, using EXISTS (or NOT EXISTS) usually improves query efficiency. In a subquery, the NOT IN clause performs an internal sort and merge. In either case NOT IN is the least efficient, because it performs a full table scan of the table in the subquery. To avoid NOT IN, rewrite it as an outer join or as NOT EXISTS.

Efficient:
SELECT * FROM emp WHERE empno > 0 AND EXISTS (SELECT 'x' FROM dept WHERE dept.deptno = emp.deptno AND loc = 'MELB')

Inefficient:
SELECT * FROM emp WHERE empno > 0 AND deptno IN (SELECT deptno FROM dept WHERE loc = 'MELB')

In both statements emp is the base table.
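
A quick sketch can confirm that the EXISTS form and the IN form select the same rows. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle and checks only result equivalence, not relative speed. The ename column and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, loc TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'MELB'), (20, 'SYD');
    INSERT INTO emp  VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 10), (4, 'd', NULL);
""")

# Correlated EXISTS subquery.
exists_rows = sorted(conn.execute("""
    SELECT * FROM emp
    WHERE empno > 0
      AND EXISTS (SELECT 'x' FROM dept
                  WHERE dept.deptno = emp.deptno AND loc = 'MELB')
"""))

# Equivalent IN subquery.
in_rows = sorted(conn.execute("""
    SELECT * FROM emp
    WHERE empno > 0
      AND deptno IN (SELECT deptno FROM dept WHERE loc = 'MELB')
"""))

print(exists_rows == in_rows, exists_rows)
```

Which form is faster on a given Oracle version should be verified against the actual execution plan.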

(16) Identify inefficient SQL statements:

Although graphical SQL optimization tools keep appearing, writing your own SQL tools to find such statements is always the best way to solve the problem.


(17) Use indexes to improve efficiency:

An index is a conceptual part of a table that improves data retrieval efficiency. Oracle uses a sophisticated self-balancing B-tree structure. Querying data through an index is usually faster than a full table scan. The Oracle optimizer uses indexes when it looks for the best path to execute queries and UPDATE statements. Using indexes when joining multiple tables also improves efficiency. Another benefit of an index is that it provides uniqueness checking for the primary key. Except for LONG and LONG RAW data types, you can index almost any column. Indexes are generally most effective on large tables, although scanning small tables through an index can also improve efficiency. Although indexes improve query efficiency, we must also pay attention to their cost. An index needs storage space and regular maintenance, and the index itself is modified whenever a record is added to or removed from the table or an indexed column is modified. This means that every INSERT, DELETE, and UPDATE pays four or five extra disk I/Os per record. Because indexes need extra storage and processing, unnecessary indexes actually slow query response time. Rebuilding indexes regularly is necessary.


(18) Replace DISTINCT with EXISTS:

When submitting a query that contains one-to-many table information (such as a department table and an employee table), avoid using DISTINCT in the SELECT clause. In general, consider replacing it with EXISTS. EXISTS makes the query faster because the RDBMS core returns a result as soon as the subquery condition is satisfied. Example:

Inefficient:
SELECT DISTINCT d.DEPT_NO, DEPT_NAME FROM dept d, emp e WHERE d.DEPT_NO = e.DEPT_NO

Efficient:
SELECT DEPT_NO, DEPT_NAME FROM dept d WHERE EXISTS (SELECT 'x' FROM emp e WHERE e.DEPT_NO = d.DEPT_NO);
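
The equivalence of the two forms can be illustrated with the sketch below. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle and compares only the rows returned. The sample departments and employees, including one department with no employees, are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE emp  (emp_no INTEGER PRIMARY KEY, dept_no INTEGER);
    INSERT INTO dept VALUES (10, 'Sales'), (20, 'IT'), (30, 'Empty');
    INSERT INTO emp  VALUES (1, 10), (2, 10), (3, 20);
""")

# The join yields one row per employee, so DISTINCT is needed to collapse them.
distinct_rows = sorted(conn.execute("""
    SELECT DISTINCT d.dept_no, d.dept_name
    FROM dept d, emp e
    WHERE d.dept_no = e.dept_no
"""))

# EXISTS yields at most one row per department without any de-duplication step.
exists_rows = sorted(conn.execute("""
    SELECT d.dept_no, d.dept_name
    FROM dept d
    WHERE EXISTS (SELECT 'x' FROM emp e WHERE e.dept_no = d.dept_no)
"""))

print(distinct_rows == exists_rows, exists_rows)
```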

(19) Write SQL statements in upper case, because Oracle always parses the SQL statement first and converts lower-case letters to upper case before executing it.

(20) In Java code, use the "+" operator to concatenate strings as little as possible.

(21) Avoid using NOT on indexed columns. NOT has the same effect as applying a function to an indexed column: when Oracle encounters NOT, it stops using the index and performs a full table scan.

(22) Avoid calculations on indexed columns:

In the WHERE clause, if an indexed column is part of a function or expression, the optimizer uses a full table scan instead of the index. For example:

Inefficient:
SELECT ... FROM dept WHERE sal * 12 > 25000;

Efficient:
SELECT ... FROM dept WHERE sal > 25000/12;

(23) Replace > with >=:

Efficient:
SELECT * FROM emp WHERE deptno >= 4

Inefficient:
SELECT * FROM emp WHERE deptno > 3

The difference is that with the former the DBMS jumps directly to the first record with DEPTNO equal to 4, while with the latter it first locates the records with DEPTNO = 3 and then scans forward to the first record with DEPTNO greater than 3.

(24) Replace OR with UNION (applies to indexed columns):

In general, replacing OR in the WHERE clause with UNION gives better results. Using OR on indexed columns causes a full table scan. Note that this rule holds only when multiple indexed columns are involved. If a column is not indexed, query efficiency may actually fall because you did not choose OR. In the following example, both LOC_ID and REGION are indexed.

Efficient:
SELECT LOC_ID, LOC_DESC, REGION FROM location WHERE LOC_ID = 10
UNION
SELECT LOC_ID, LOC_DESC, REGION FROM location WHERE REGION = 'MELBOURNE'

Inefficient:
SELECT LOC_ID, LOC_DESC, REGION FROM location WHERE LOC_ID = 10 OR REGION = 'MELBOURNE'

If you insist on using OR, write the indexed column that returns the fewest records first.
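
The OR form and the UNION form can be compared with a short sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle and checks only that both return the same rows. The sample rows are assumptions, chosen so that one row satisfies both conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (loc_id INTEGER, loc_desc TEXT, region TEXT);
    CREATE INDEX idx_location_loc_id ON location (loc_id);
    CREATE INDEX idx_location_region ON location (region);
    INSERT INTO location VALUES
        (10, 'Head office', 'MELBOURNE'),  -- matches both conditions
        (10, 'Annex',       'PERTH'),
        (20, 'Depot',       'MELBOURNE'),
        (30, 'Store',       'SYDNEY');
""")

or_rows = sorted(conn.execute("""
    SELECT loc_id, loc_desc, region FROM location
    WHERE loc_id = 10 OR region = 'MELBOURNE'
"""))

# UNION removes the row that both branches return.
union_rows = sorted(conn.execute("""
    SELECT loc_id, loc_desc, region FROM location WHERE loc_id = 10
    UNION
    SELECT loc_id, loc_desc, region FROM location WHERE region = 'MELBOURNE'
"""))

print(or_rows == union_rows, len(or_rows))
```

The two forms are interchangeable only when the selected rows are themselves distinct, because UNION would also collapse genuinely duplicated rows that OR would return twice.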

(25) Replace OR with IN:

This is a simple, easy-to-remember rule, but the actual effect must be tested. Under Oracle8i the execution paths of the two appear to be the same.

Inefficient:
SELECT ... FROM location WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30

Efficient:
SELECT ... FROM location WHERE LOC_ID IN (10, 20, 30);

(26) Avoid using IS NULL and IS NOT NULL on indexed columns:

Avoid using any nullable column in an index, because Oracle will be unable to use the index for such tests. For a single-column index, a record whose column is null has no entry in the index. For a composite index, a record has no entry only if every indexed column is null. If at least one column is not null, the record is present in the index. For example, if a unique index is created on columns A and B of a table and the table already contains a record whose A and B values are (123, null), Oracle will not accept another record with the same A and B values (123, null). However, if all the indexed columns are null, Oracle treats the whole key as null, and null is not equal to null, so you can insert 1000 records with the same key value, provided that they are all null. Because null values are not present in the index, a null comparison on an indexed column in the WHERE clause makes Oracle stop using the index.

Inefficient (index not used):
SELECT ... FROM department WHERE DEPT_CODE IS NOT NULL;

Efficient (index used):
SELECT ... FROM department WHERE DEPT_CODE >= 0;

(27) Always use the first column of the index:

If an index is built on multiple columns, the optimizer chooses the index only when its first column (the leading column) is referenced in the WHERE clause. This is a simple but important rule. When only the second column of the index is referenced, the optimizer ignores the index and uses a full table scan.
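
The leading-column rule can be observed in a hedged sketch. It runs on SQLite through Python's sqlite3 module, not on Oracle, so the plan text is SQLite's, but with no statistics gathered SQLite behaves analogously: it searches the composite index only when the leading column is constrained. The table orders, its columns, and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, note TEXT);
    CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);
""")

def plan(sql):
    """Join the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Condition on the leading column of the composite index.
leading_plan = plan("SELECT * FROM orders WHERE customer_id = 42")
# Condition on the second column only.
second_only_plan = plan("SELECT * FROM orders WHERE status = 'open'")

print(leading_plan)
print(second_only_plan)
```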

(28) Replace UNION with UNION ALL (where possible):

When an SQL statement needs to UNION two query result sets, the two result sets are first merged in UNION ALL fashion and then sorted before the final result is output. If UNION ALL is used instead of UNION, the sort is unnecessary and efficiency improves accordingly. Note that UNION ALL outputs records that appear in both result sets more than once, so you still need to judge from the business requirements whether UNION ALL is feasible. UNION sorts the result set, and this operation uses the SORT_AREA_SIZE memory, so tuning this memory is also important. The consumption of sorting can be queried with SQL.


(29) Replace ORDER BY with WHERE:

The ORDER BY clause uses an index only under two strict conditions:
All columns in the ORDER BY must be contained in the same index and keep the order in which they appear in the index.
All columns in the ORDER BY must be defined as NOT NULL.
In addition, the index used by the WHERE clause and the index used by the ORDER BY clause cannot be used side by side.

For example:

The DEPT table contains the following columns:


Inefficient (index not used):
SELECT DEPT_CODE FROM dept ORDER BY DEPT_TYPE

Efficient (index used):
SELECT DEPT_CODE FROM dept WHERE DEPT_TYPE > 0

(30) Avoid changing the type of indexed columns:

When data of different types are compared, Oracle automatically performs simple type conversion on the columns.
Assume that EMPNO is an indexed column of numeric type.


In fact, after Oracle's type conversion, the statement becomes:


Fortunately, the type conversion does not happen on the indexed column, so the index is still used.
Now assume that EMP_TYPE is an indexed column of character type.


This statement is converted by Oracle to:


Because of the internal type conversion, the index will not be used. To avoid implicit type conversion by Oracle on your SQL, it is best to write the conversion explicitly. Note that when comparing characters with numbers, Oracle converts the character type to the numeric type first.


select emp_name from employee where salary > 3000

In this statement, if salary is of type Float, the optimizer converts the condition to Convert(float, 3000). Because 3000 is an integer, we should write 3000.0 in the program rather than leave the conversion to the DBMS at run time. The same applies to conversion between character and integer data.

(31) WHERE clauses to be careful with:

The WHERE clauses of some SELECT statements do not use indexes. Some cases are:
(1) '!=' does not use an index. Remember that an index can only tell you what exists in the table, not what does not exist in it.
(2) '||' is the character concatenation function. Like other functions, it disables the index.
(3) '+' is a mathematical function. Like other mathematical functions, it disables the index.
(4) Identical indexed columns cannot be compared with each other, as this causes a full table scan.

(32) a. If more than 30% of the records in a table are retrieved, using an index brings no significant gain in efficiency. b. In certain cases using an index may be slower than a full table scan, but the difference is within the same order of magnitude, whereas in general using an index is several times or even several thousand times faster than a full table scan.

(33) Avoid resource-intensive operations:

SQL statements containing DISTINCT, UNION, MINUS, INTERSECT, or ORDER BY make the SQL engine perform resource-intensive sorting (SORT). DISTINCT requires one sort operation, while the others require at least two. SQL statements with UNION, MINUS, or INTERSECT can usually be rewritten in other ways. If your database's SORT_AREA_SIZE is well configured, UNION, MINUS, and INTERSECT can still be considered, since they are very readable.

(34) Optimize GROUP BY:

To improve the efficiency of a GROUP BY statement, filter out unneeded records before the GROUP BY. The following two queries return the same result, but the second is much faster.

Inefficient:
SELECT JOB, AVG(SAL) FROM emp GROUP BY JOB HAVING JOB = 'PRESIDENT' OR JOB = 'MANAGER'

Efficient:
SELECT JOB, AVG(SAL) FROM emp WHERE JOB = 'PRESIDENT' OR JOB = 'MANAGER' GROUP BY JOB
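
That filtering before or after the grouping gives the same result here can be checked with the sketch below. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle and compares only the results, not the timing. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (job TEXT, sal INTEGER);
    INSERT INTO emp VALUES
        ('PRESIDENT', 5000), ('MANAGER', 3000), ('MANAGER', 2000), ('CLERK', 1000);
""")

# Filter after grouping: every job is aggregated, then unwanted groups are dropped.
having_rows = sorted(conn.execute("""
    SELECT job, AVG(sal) FROM emp
    GROUP BY job
    HAVING job = 'PRESIDENT' OR job = 'MANAGER'
"""))

# Filter before grouping: only the wanted rows are aggregated.
where_rows = sorted(conn.execute("""
    SELECT job, AVG(sal) FROM emp
    WHERE job = 'PRESIDENT' OR job = 'MANAGER'
    GROUP BY job
"""))

print(having_rows == where_rows, where_rows)
```

The rewrite is valid because the condition refers only to the grouping column and not to an aggregate.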
