In the early stage of application development, the development database holds little data, so the quality of a query or a complex view cannot be judged from its performance. Once the application goes into production and the data in the database grows, system response time becomes one of the most important problems to solve. An important aspect of system optimization is the optimization of SQL statements. On massive data, the speed difference between a poor SQL statement and a well-written one can reach hundreds of times. A system therefore should not merely deliver its functions; it should be built on high-quality SQL statements that improve its usability.
In most cases, Oracle uses indexes to traverse tables more quickly, and the optimizer improves performance primarily through the defined indexes. However, if the WHERE clause of a SQL statement is written unreasonably, the optimizer ignores the index and falls back to a full table scan; such a statement is what is generally called poor SQL. When writing SQL statements, we should understand the rules by which the optimizer discards an index, which helps in writing high-performance SQL.
Second, points to note when writing SQL statements
The following describes in detail what to watch for when writing the WHERE clause of SQL statements. In these WHERE clauses, even if some columns are indexed, poorly written SQL prevents the system from using the index, so it runs a full table scan and responds very slowly.
1. Operator optimization
(a) IN operator
SQL written with IN is easy to write and easy to understand, which suits modern software development style. However, SQL with IN often performs worse. Oracle parses SQL with IN differently from SQL without it in the following way:
Oracle attempts to convert the statement into a join of multiple tables. If the conversion fails, it first executes the subquery inside IN and then queries the outer table; if the conversion succeeds, it uses the multi-table join directly. SQL with IN therefore involves at least one extra conversion step. Ordinary SQL can usually be converted successfully, but SQL that contains grouping, aggregation, and the like cannot.
Recommended approach: in business-intensive SQL, try to avoid the IN operator and use EXISTS instead.
(b) NOT IN operator
This operator is strongly discouraged because it cannot use the table's indexes.
Recommended approach: replace it with NOT EXISTS.
(c) IS NULL or IS NOT NULL (testing whether a field is empty)
Testing whether a field is empty generally cannot use an index, because the index does not store null keys. For a single-column index, a record whose column is null has no entry in the index. For an index on several columns, a record is left out of the index only when every indexed column is null. This means that indexing a column does not speed up a search for its null values. The optimizer generally does not use the index for a WHERE clause that contains IS NULL or IS NOT NULL.
Recommended approach: replace the test with another operation that has the same effect, for example change A IS NOT NULL to a range comparison such as A > 0. Alternatively, do not allow the field to be empty and use a default value in place of null; for example, the status field of a requisition is not allowed to be empty and defaults to "requested".
(d) > and < operators (greater than and less than)
The greater-than and less-than operators generally need no tuning, because an index on the column is used for the lookup. In some cases they can still be optimized. Suppose a table has 1 million records and a numeric field A, with 300,000 records where A=0, 300,000 records where A=1, 390,000 records where A=2, and 10,000 records where A=3. There is a big difference between A>2 and A>=3: for A>2 Oracle first locates the index entries for 2 and then compares, while for A>=3 Oracle locates the index entries for 3 directly.
(e) LIKE operator
The LIKE operator supports wildcard queries, and wildcard combinations can express almost any query, but used poorly it causes performance problems. For example, LIKE '%5400%' does not use the index, whereas LIKE 'X5400%' uses a range scan on the index.
A practical example: in the YW_YHJBQK table, the business number YY_BH carries the user identification number after a prefix. Querying it with YY_BH LIKE '%5400%' results in a full table scan. If the condition is changed to YY_BH LIKE 'X5400%' OR YY_BH LIKE 'B5400%', the query performs two range scans on the YY_BH index, and performance improves greatly.
A LIKE statement with a wildcard character (%):
This is the same situation as the example above. Suppose the requirement is to query the employee table for people whose last name contains cliton. The following SQL statement can be used:
SELECT * FROM employee WHERE last_name LIKE '%cliton%';
Because the wildcard (%) appears at the beginning of the search term, Oracle does not use the last_name index. In many cases this cannot be avoided, but keep in mind that using a wildcard this way slows down the query. When the wildcard appears elsewhere in the string, however, the optimizer can take advantage of the index. The index is used in the following query:
SELECT * FROM employee WHERE last_name LIKE 'c%';
(f) UNION operator
UNION filters out duplicate records after combining the tables: the combined result set is sorted, duplicate records are removed, and the result is returned. In most real applications no duplicate records are produced; the most common case is a UNION of a process table and a history table, such as:
SELECT * FROM Gc_dfys UNION SELECT * FROM Ls_jg_dfys
At run time this SQL first fetches the results of both tables, then sorts them in the sort area to remove duplicates, and finally returns the result set. If the tables are large, the sort may spill to disk.
Recommended approach: use UNION ALL instead of UNION, because UNION ALL simply merges the two results and returns them.
SELECT * FROM Gc_dfys UNION ALL SELECT * FROM Ls_jg_dfys
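The difference between the two operators can be seen on a toy data set. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the column layout and rows of the two tables are invented for illustration. It shows the result sets only, not the sort cost.

```python
import sqlite3

# Hypothetical miniature versions of the process and history tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gc_dfys (id INTEGER, amount INTEGER);
    CREATE TABLE ls_jg_dfys (id INTEGER, amount INTEGER);
    INSERT INTO gc_dfys VALUES (1, 100), (2, 200);
    INSERT INTO ls_jg_dfys VALUES (2, 200), (3, 300);
""")

# UNION has to detect and drop the duplicate row (2, 200).
union_rows = conn.execute(
    "SELECT * FROM gc_dfys UNION SELECT * FROM ls_jg_dfys"
).fetchall()

# UNION ALL simply appends the second result set to the first.
union_all_rows = conn.execute(
    "SELECT * FROM gc_dfys UNION ALL SELECT * FROM ls_jg_dfys"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

When the two tables are known not to overlap, both forms return the same rows, and UNION ALL is then the cheaper choice.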
(g) Concatenated columns
For a concatenated column, the optimizer does not use the index, even if the value being compared is a constant. Consider an example: a staff table (employee) stores a worker's first name and last name in two columns (first_name and last_name), and we want to find an employee named Bill Cliton.
Here is a SQL statement that queries on the concatenation:
SELECT * FROM employee WHERE first_name || ' ' || last_name = 'Bill Cliton';
This statement does find out whether the employee Bill Cliton exists, but note that the optimizer does not use the index created on last_name. When the condition is written as follows, Oracle can use the index created on last_name:
WHERE first_name = 'Bill' AND last_name = 'Cliton';
(h) ORDER BY statement
The ORDER BY statement determines how Oracle sorts the returned query results. It places no special restrictions on the columns to be sorted, and functions can also be applied to the columns (such as concatenation or addition). Any non-indexed item or computed expression in the ORDER BY statement slows down the query.
Check ORDER BY statements carefully for non-indexed items or expressions that degrade performance. The solution is to rewrite the ORDER BY statement so that it uses an index, or to create another index on the columns used, and to avoid expressions in the ORDER BY clause altogether.
(i) NOT
When querying, we often use logical expressions in the WHERE clause, such as greater than, less than, equal to, and not equal to, and we can also use AND, OR, and NOT. NOT can negate any logical operator. The following is an example of a NOT clause:
WHERE NOT (status = 'VALID')
To use NOT, enclose the phrase to be negated in parentheses and put the NOT operator in front of it. The NOT operator is also contained in another logical operator, the not-equal (<>) operator. In other words, NOT is still present in the operator even when the word NOT does not appear explicitly in the WHERE clause, as in the following example:
WHERE status <> 'INVALID';
Now consider a query on employee with the condition salary <> 3000. It can be rewritten without NOT:
SELECT * FROM employee WHERE salary < 3000 OR salary > 3000;
Although the two queries return the same results, the second is faster than the first. The second query allows Oracle to use an index on the salary column, while the first cannot.
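That the two forms select the same rows, including how they treat null salaries, can be checked on a small data set. This is a sketch under stated assumptions: Python's sqlite3 module stands in for Oracle, and the employee rows are invented. It confirms equivalence of results only and says nothing about which execution plan Oracle picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_name TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('a', 2500), ('b', 3000), ('c', 3500), ('d', NULL);
""")

# Form with the not-equal operator.
not_equal = conn.execute(
    "SELECT emp_name FROM employee WHERE salary <> 3000 ORDER BY emp_name"
).fetchall()

# Rewritten form with two range conditions.
split_range = conn.execute(
    "SELECT emp_name FROM employee "
    "WHERE salary < 3000 OR salary > 3000 ORDER BY emp_name"
).fetchall()

# Neither form returns the row with a NULL salary.
print(not_equal == split_range)
```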
2. Impact of how SQL is written
(a) Different SQL texts for the same function
Suppose programmer A writes: SELECT * FROM zl_yhjbqk
Programmer B writes: SELECT * FROM dlyx.zl_yhjbqk (prefixed with the table owner)
Programmer C writes: SELECT * FROM DLYX.ZL_YHJBQK (uppercase table name)
Programmer D writes: SELECT *  FROM DLYX.ZL_YHJBQK (extra spaces in the middle)
The four statements return the same result and, once parsed, take the same time to execute. Looking at the SGA shared memory, however, Oracle parses each of them separately and each one occupies shared memory. If the SQL string and its formatting are written exactly the same, Oracle parses it only once and keeps a single parsed copy in shared memory. This reduces parsing time, reduces duplicated information in shared memory, and lets Oracle count the execution frequency of the SQL accurately.
(b) Effect of the order of conditions after WHERE
The order of conditions in the WHERE clause directly affects queries on large tables. For example:
SELECT * FROM zl_yhjbqk WHERE dy_dj = '1KV or less' AND xh_bz = 1
SELECT * FROM zl_yhjbqk WHERE xh_bz = 1 AND dy_dj = '1KV or less'
Neither dy_dj (voltage level) nor xh_bz (account-closed flag) is indexed in these two statements, so both run as full table scans. The condition dy_dj = '1KV or less' matches 99% of the records, while xh_bz = 1 matches only 0.5%. In the first statement, 99% of the records are compared on both dy_dj and xh_bz, whereas in the second only 0.5% of the records are compared on both, so the second statement uses noticeably less CPU than the first.
(c) Effect of the order of queried tables
The order of the tables listed after FROM affects SQL performance. When there are no indexes and Oracle has not gathered statistics on the tables, Oracle joins them in the order in which they appear, so a poorly chosen table order produces intermediate data that consumes a great deal of server resources. (Note: if the tables have been analyzed, Oracle automatically joins the small table first and then the large table.)
3. Use of indexes by SQL statements
(a) Some optimizations for condition fields
Fields processed by a function cannot use an index, for example:
substr(hbs_bh,1,4) = '5400', optimized: hbs_bh LIKE '5400%'
trunc(sk_rq) = trunc(sysdate), optimized: sk_rq >= trunc(sysdate) AND sk_rq < trunc(sysdate+1)
Fields that undergo explicit or implicit operations cannot use an index, for example:
ss_df+20 > 50, optimized: ss_df > 30
'X' || hbs_bh > 'X5400021452', optimized: hbs_bh > '5400021452'
sk_rq+5 = sysdate, optimized: sk_rq = sysdate-5
hbs_bh = 5401002554, optimized: hbs_bh = '5401002554' (note: this condition implicitly applies to_number to hbs_bh, because hbs_bh is a character field)
Conditions that operate on several fields of the table cannot use an index, for example:
ys_df > cx_df, cannot be optimized
qc_bh || kh_bh = '5400250000', optimized: qc_bh = '5400' AND kh_bh = '250000'
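Before replacing a condition with its optimized form, it is worth confirming that both select the same rows. The sketch below does this for two of the rewrites above. It assumes Python's sqlite3 module as a stand-in for Oracle, and the table name bills, its rows, and the helper rows() are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bills (hbs_bh TEXT, ss_df INTEGER);  -- hypothetical table
    INSERT INTO bills VALUES
        ('5400021452', 10), ('5401002554', 40),
        ('5400250000', 31), ('540', 30);
""")

def rows(where):
    """Return the sorted hbs_bh values that satisfy a WHERE condition."""
    sql = "SELECT hbs_bh FROM bills WHERE " + where + " ORDER BY hbs_bh"
    return [r[0] for r in conn.execute(sql)]

# Function applied to the column versus a prefix match on the bare column.
func_form = rows("substr(hbs_bh, 1, 4) = '5400'")
like_form = rows("hbs_bh LIKE '5400%'")

# Arithmetic on the column versus the constant moved to the other side.
calc_form = rows("ss_df + 20 > 50")
plain_form = rows("ss_df > 30")

print(func_form == like_form, calc_form == plain_form)
```

In both pairs the second form leaves the column untouched on one side of the comparison, which is what allows an index on that column to be considered.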
4. More SQL optimization tips
(1) Choose the most efficient order of table names (effective only under the rule-based optimizer):
The Oracle parser processes the table names in the FROM clause from right to left, so the table written last in the FROM clause (the driving table) is processed first. When the FROM clause contains more than one table, choose the table with the fewest records as the driving table. If the query joins more than three tables, choose the intersection table as the driving table; the intersection table is the one that is referenced by the other tables.
(2) Join order in the WHERE clause:
Oracle parses the WHERE clause from the bottom up. Accordingly, joins between tables must be written before the other WHERE conditions, and the conditions that filter out the largest number of records must be written at the end of the WHERE clause.
(3) Avoid using '*' in the SELECT clause:
During parsing, Oracle expands '*' into all the column names by querying the data dictionary, which takes additional time.
(4) Reduce the number of database accesses:
Oracle does a lot of work internally for every access: parsing the SQL statement, estimating index utilization, binding variables, reading data blocks, and so on.
(5) In SQL*Plus, SQL*Forms, and Pro*C, reset the ARRAYSIZE parameter to increase the amount of data retrieved per database access; the recommended value is 200.
(6) Use the DECODE function to reduce processing time:
The DECODE function can avoid repeated scans of the same records or repeated joins to the same table.
(7) Combine simple, unrelated database accesses:
If you have several simple database queries, you can combine them into a single query (even if they are unrelated to each other).
(8) Delete duplicate records:
The most efficient way to delete duplicate records (because it uses ROWID) is, for example:
DELETE FROM emp e WHERE e.rowid > (SELECT MIN(x.rowid) FROM emp x WHERE x.emp_no = e.emp_no);
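The same keep-the-lowest-rowid idea can be tried out on a small table. This is a sketch, not Oracle code: it relies on SQLite's own implicit rowid through Python's sqlite3 module, the rows are invented, and the outer table is referenced by name instead of by alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_no INTEGER, name TEXT);
    INSERT INTO emp VALUES
        (1, 'A'), (1, 'A'), (2, 'B'), (2, 'B'), (2, 'B'), (3, 'C');
""")

# For each emp_no, keep the row with the smallest rowid and delete the rest.
conn.execute("""
    DELETE FROM emp
    WHERE rowid > (SELECT MIN(x.rowid) FROM emp x
                   WHERE x.emp_no = emp.emp_no)
""")

remaining = conn.execute(
    "SELECT emp_no, name FROM emp ORDER BY emp_no"
).fetchall()
print(remaining)
```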
(9) Replace DELETE with TRUNCATE:
When records are deleted from a table, rollback segments are normally used to hold the information needed for recovery. If you have not committed the transaction, Oracle can restore the data to its state before the deletion (more precisely, to the state before the DELETE command was executed). With TRUNCATE, the rollback segments no longer hold any recoverable information, and once the command has run the data cannot be restored. As a result, very few resources are used and execution is fast. (Translator's note: TRUNCATE applies only when deleting the whole table; TRUNCATE is DDL, not DML.)
(10) Use COMMIT as much as possible:
Whenever possible, commit as often as you can in the program. Performance improves and resource demand drops because of the resources that COMMIT releases:
A. Information on the rollback segments used to recover data
B. Locks acquired by program statements
C. Space in the redo log buffer
D. Oracle's internal overhead for managing the three resources above
(11) Replace the HAVING clause with a WHERE clause:
Avoid the HAVING clause where possible, because HAVING filters the result set only after all records have been retrieved, which requires sorting, totaling, and similar operations. Limiting the number of records with the WHERE clause reduces this overhead.
(Not specific to Oracle) Conditions can be placed in three clauses: ON, WHERE, and HAVING. ON is applied first, WHERE second, and HAVING last. Because ON filters out non-matching records before any aggregation, it reduces the data that intermediate operations have to process and should be the fastest. WHERE should in turn be faster than HAVING, because it filters the data before the aggregation is computed. ON is used only when two tables are joined, so for a single table only WHERE and HAVING remain to be compared.
In a single-table aggregate query, if the filter condition does not involve the computed field, WHERE and HAVING give the same result, but WHERE can use Rushmore technology and HAVING cannot, so HAVING is slower. If the condition involves the computed field, the field's value is unknown before the computation. As described above, WHERE takes effect before the computation and HAVING takes effect after it, so in this case the two give different results.
In a multi-table join query, ON takes effect earlier than WHERE. The system first combines the tables into a temporary table according to the join conditions, then filters with WHERE, then computes, and then filters with HAVING. To make a filter condition work correctly, first understand at what point the condition should take effect, and then decide where to put it.
(12) Reduce the number of queries on a table:
In SQL statements that contain subqueries, pay particular attention to reducing the number of queries on the table. Example:
SELECT tab_name FROM TABLES WHERE (tab_name, db_ver) = (SELECT tab_name, db_ver FROM tab_columns WHERE VERSION = 604)
(13) Improve SQL efficiency with built-in functions:
Complex SQL often sacrifices execution efficiency. Knowing how to apply the functions above to solve problems is very valuable in practical work.
(14) Use table aliases:
When joining multiple tables in a SQL statement, use table aliases and prefix each column with its alias. This reduces parsing time and avoids syntax errors caused by column ambiguity.
(15) Replace IN with EXISTS, and NOT IN with NOT EXISTS:
In many queries on a base table, another table has to be joined in order to satisfy a condition. In such cases, using EXISTS (or NOT EXISTS) usually makes the query more efficient. In a subquery, the NOT IN clause performs an internal sort and merge. In either case, NOT IN is the least efficient, because it performs a full table scan of the table in the subquery. To avoid NOT IN, rewrite it as an outer join or as NOT EXISTS.
Efficient: SELECT * FROM EMP WHERE EMPNO > 0 AND EXISTS (SELECT 'X' FROM DEPT WHERE DEPT.DEPTNO = EMP.DEPTNO AND LOC = 'Melb') (EMP is the base table)
Inefficient: SELECT * FROM EMP WHERE EMPNO > 0 AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOC = 'Melb') (EMP is the base table)
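One point to check when making this rewrite is null handling: IN and EXISTS select the same rows, but NOT IN and NOT EXISTS differ as soon as the subquery can return a null, because of SQL's three-valued logic. The sketch below illustrates this with Python's sqlite3 module as a stand-in for Oracle; the rows and the helper empnos() are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER, loc TEXT);
    CREATE TABLE emp (empno INTEGER, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'MELB'), (20, 'SYD'), (NULL, 'MELB');
    INSERT INTO emp VALUES (1, 10), (2, 20), (3, 30);
""")

def empnos(condition):
    """Return the sorted empno values that satisfy a WHERE condition."""
    sql = "SELECT empno FROM emp WHERE " + condition + " ORDER BY empno"
    return [r[0] for r in conn.execute(sql)]

sub = "SELECT deptno FROM dept WHERE loc = 'MELB'"
corr = "SELECT 'X' FROM dept WHERE dept.deptno = emp.deptno AND loc = 'MELB'"

in_rows = empnos("deptno IN (" + sub + ")")
exists_rows = empnos("EXISTS (" + corr + ")")

# The subquery returns (10, NULL): NOT IN can then never be true,
# while NOT EXISTS still returns the employees without a match.
not_in_rows = empnos("deptno NOT IN (" + sub + ")")
not_exists_rows = empnos("NOT EXISTS (" + corr + ")")

print(in_rows, exists_rows, not_in_rows, not_exists_rows)
```

If the subquery column is declared NOT NULL, the two negative forms agree and the rewrite is safe.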
(16) Identify inefficiently executed SQL statements:
Although many graphical tools for SQL optimization exist, writing your own SQL tool to solve the problem is always a good approach:
SELECT EXECUTIONS, DISK_READS, BUFFER_GETS, ROUND((BUFFER_GETS - DISK_READS) / BUFFER_GETS, 2) Hit_ratio, ROUND(DISK_READS / EXECUTIONS, 2) Reads_per_run, SQL_TEXT FROM V$SQLAREA WHERE EXECUTIONS > 0 AND BUFFER_GETS > 0 AND (BUFFER_GETS - DISK_READS) / BUFFER_GETS < 0.8 ORDER BY 4 DESC;
(17) Use indexes to improve efficiency:
An index is a conceptual part of a table that is used to retrieve data more efficiently; Oracle implements it with a complex self-balancing B-tree structure. In general, querying data through an index is faster than a full table scan. The Oracle optimizer uses indexes when it looks for the best path to execute queries and UPDATE statements. Using indexes when joining multiple tables also improves efficiency. Another benefit of an index is that it enforces the uniqueness of the primary key. Apart from the LONG and LONG RAW data types, you can index almost any column. Indexes are particularly effective on large tables, although you will find that they can also improve efficiency when scanning small tables.
Although indexes improve query efficiency, we must also pay attention to their cost. Indexes need storage space and regular maintenance, and the index itself is modified whenever a record is added to the table or an indexed column is changed. This means that each INSERT, DELETE, and UPDATE of a record costs an extra four or five disk I/Os. Because indexes require additional storage space and processing, unnecessary indexes slow down query response time. It is necessary to rebuild indexes periodically:
ALTER INDEX <INDEXNAME> REBUILD TABLESPACE <TABLESPACENAME>
(18) Replace DISTINCT with EXISTS:
When a query involves one-to-many table information (such as a department table and an employee table), avoid using DISTINCT in the SELECT clause. Consider EXISTS instead; EXISTS makes the query faster because the RDBMS core returns the result as soon as the subquery condition is satisfied. Example:
Inefficient: SELECT DISTINCT D.DEPT_NO, D.DEPT_NAME FROM DEPT D, EMP E WHERE D.DEPT_NO = E.DEPT_NO
Efficient: SELECT DEPT_NO, DEPT_NAME FROM DEPT D WHERE EXISTS (SELECT 'X' FROM EMP E WHERE E.DEPT_NO = D.DEPT_NO);
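The two forms can be compared on a department with several employees and a department with none. The following is a sketch only, assuming Python's sqlite3 module as a stand-in for Oracle and invented rows; it checks that the results match and does not measure speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_no INTEGER, dept_name TEXT);
    CREATE TABLE emp (emp_no INTEGER, dept_no INTEGER);
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'OPS'), (30, 'HR');
    INSERT INTO emp VALUES (1, 10), (2, 10), (3, 20);
""")

# The join yields one row per employee, so DISTINCT must remove duplicates.
distinct_rows = conn.execute("""
    SELECT DISTINCT d.dept_no, d.dept_name
    FROM dept d, emp e
    WHERE d.dept_no = e.dept_no
    ORDER BY d.dept_no
""").fetchall()

# EXISTS only asks whether at least one matching employee is present.
exists_rows = conn.execute("""
    SELECT dept_no, dept_name
    FROM dept d
    WHERE EXISTS (SELECT 'X' FROM emp e WHERE e.dept_no = d.dept_no)
    ORDER BY dept_no
""").fetchall()

print(distinct_rows == exists_rows)
```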
(19) Write SQL statements in uppercase, because Oracle always parses a SQL statement by first converting lowercase letters to uppercase.
(20) Use the "+" operator sparingly for string concatenation in Java code.
(21) Avoid using NOT on indexed columns:
NOT has the same effect as applying a function to an indexed column. When Oracle encounters NOT, it stops using the index and performs a full table scan instead.
(22) Avoid calculations on indexed columns:
If an indexed column in the WHERE clause is part of a function or expression, the optimizer uses a full table scan instead of the index. Example:
Inefficient: SELECT ... FROM DEPT WHERE SAL * 12 > 25000;
Efficient: SELECT ... FROM DEPT WHERE SAL > 25000/12;
(23) Replace > with >=:
Efficient: SELECT * FROM EMP WHERE DEPTNO >= 4
Inefficient: SELECT * FROM EMP WHERE DEPTNO > 3
The difference is that with the former the DBMS jumps directly to the first record whose DEPTNO equals 4, whereas with the latter it first locates the records with DEPTNO = 3 and then scans forward to the first record whose DEPTNO is greater than 3.
(24) Replace OR with UNION (for indexed columns):
In general, replacing OR in a WHERE clause with UNION works well. Using OR on indexed columns causes a full table scan. Note that this rule is valid only when several indexed columns are involved. If a column is not indexed, query efficiency may drop because you did not choose OR. In the following example, indexes exist on both LOC_ID and REGION.
Efficient: SELECT LOC_ID, LOC_DESC, REGION FROM LOCATION WHERE LOC_ID = 10 UNION SELECT LOC_ID, LOC_DESC, REGION FROM LOCATION WHERE REGION = 'MELBOURNE'
Inefficient: SELECT LOC_ID, LOC_DESC, REGION FROM LOCATION WHERE LOC_ID = 10 OR REGION = 'MELBOURNE'
If you insist on using OR, put the indexed column that returns the fewest records first.
(25) Replace OR with IN:
This is a simple and easy-to-remember rule, but its actual effect has to be tested; under Oracle8i the two execution paths appear to be the same.
Inefficient: SELECT ... FROM LOCATION WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30
Efficient: SELECT ... FROM LOCATION WHERE LOC_ID IN (10, 20, 30);
(26) Avoid using IS NULL and IS NOT NULL on indexed columns:
Avoid indexing nullable columns, because Oracle will not be able to use the index for null tests. For a single-column index, a record whose column is null has no entry in the index. For a composite index, a record has no entry in the index if every indexed column is null; if at least one column is not null, the record is in the index. For example, suppose a unique index is created on columns A and B of a table and the table already contains a record whose (A, B) values are (123, null). Oracle will then reject the insertion of another record with the same (A, B) values (123, null). If all the indexed columns are null, however, Oracle treats the whole key as null, and null is not equal to null, so you can insert 1,000 records with the same key, provided they are all null. Because null values are not stored in the index, a null comparison on an indexed column in the WHERE clause makes Oracle stop using the index.
Inefficient (index not used): SELECT ... FROM DEPARTMENT WHERE DEPT_CODE IS NOT NULL;
Efficient (index used): SELECT ... FROM DEPARTMENT WHERE DEPT_CODE >= 0;
(27) Always use the first column of an index:
If an index is built on several columns, the optimizer chooses the index only when its first column (the leading column) is referenced in the WHERE clause. This is a simple but important rule: when only the second column of the index is referenced, the optimizer ignores the index and uses a full table scan.
(28) Replace UNION with UNION ALL (if possible):
When a SQL statement takes the UNION of two query result sets, the two result sets are first merged as with UNION ALL and then sorted before the final result is output. If UNION ALL is used instead of UNION, this sort is unnecessary and efficiency improves. Note that UNION ALL outputs records that appear in both result sets twice, so you still have to check against the business requirements whether UNION ALL is feasible. UNION sorts the result set, and this sort uses the memory governed by SORT_AREA_SIZE, so tuning this memory is also important. Sort consumption can be checked with a SQL query.
Inefficient: SELECT ACCT_NUM, BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-dec-95' UNION SELECT ACCT_NUM, BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-dec-95'
Efficient: SELECT ACCT_NUM, BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-dec-95' UNION ALL SELECT ACCT_NUM, BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-dec-95'
(29) Replace ORDER BY with WHERE:
The ORDER BY clause uses an index only under two strict conditions:
All columns in the ORDER BY must be contained in the same index and keep the order in which they appear in the index.
All columns in the ORDER BY must be defined as NOT NULL.
In addition, the index used by the WHERE clause and the index used by the ORDER BY clause cannot be used side by side.
For example, the table DEPT contains the following columns:
DEPT_CODE PK NOT NULL
DEPT_DESC NOT NULL
DEPT_TYPE NULL
Inefficient (index not used): SELECT DEPT_CODE FROM DEPT ORDER BY DEPT_TYPE
Efficient (index used): SELECT DEPT_CODE FROM DEPT WHERE DEPT_TYPE > 0
(30) Avoid changing the type of indexed columns:
Oracle automatically performs simple type conversions on columns when comparing data of different types.
Suppose EMPNO is an indexed column of a numeric type.
SELECT ... FROM EMP WHERE EMPNO = '123'
In fact, after Oracle's type conversion the statement becomes:
SELECT ... FROM EMP WHERE EMPNO = TO_NUMBER('123')
Fortunately, the type conversion does not occur on the indexed column, so the index can still be used.
Now suppose EMP_TYPE is an indexed column of a character type.
SELECT ... FROM EMP WHERE EMP_TYPE = 123
Oracle converts this statement to:
SELECT ... FROM EMP WHERE TO_NUMBER(EMP_TYPE) = 123
Because of the type conversion that happens internally, the index is not used. To keep Oracle from applying implicit type conversions to your SQL, it is best to write the conversions explicitly. Note that when comparing characters with numbers, Oracle prefers to convert the character type to the numeric type.
SELECT emp_name FROM employee WHERE salary > 3000
In this statement, if salary is of type float, the optimizer rewrites the constant as CONVERT(float, 3000), because 3000 is an integer. When programming we should write 3000.0 instead of leaving the conversion to the DBMS at run time. The same applies to conversions between character and integer data.
(31) WHERE clauses that need care:
The WHERE clause in some SELECT statements does not use an index. In each of the following cases the index is not used:
(1) '!=' does not use the index. Remember, an index can only tell you what exists in the table, not what does not exist in it.
(2) '||' is the character concatenation function. As with other functions, it disables the index.
(3) '+' is a mathematical function. As with other mathematical functions, it disables the index.
(4) Indexed columns cannot be compared with each other; doing so triggers a full table scan.
(32) A. If more than 30% of the records in a table are retrieved, using an index brings no significant efficiency gain.
B. In certain situations an index may be slower than a full table scan, but the difference stays within the same order of magnitude, whereas in general an index is several times or even thousands of times faster than a full table scan.
(33) Avoid resource-intensive operations:
SQL statements containing DISTINCT, UNION, MINUS, INTERSECT, or ORDER BY make the SQL engine perform resource-intensive sorts. DISTINCT requires one sort operation, while the others require at least two. SQL statements with UNION, MINUS, or INTERSECT can usually be rewritten in other ways. If your database's SORT_AREA_SIZE is well provisioned, UNION, MINUS, and INTERSECT are still worth considering; after all, they are very readable.
(34) Optimize GROUP BY:
Make GROUP BY statements more efficient by filtering out unneeded records before the GROUP BY. The following two queries return the same result, but the second is significantly faster.
Inefficient: SELECT JOB, AVG(SAL) FROM EMP GROUP BY JOB HAVING JOB = 'president' OR JOB = 'MANAGER'
Efficient: SELECT JOB, AVG(SAL) FROM EMP WHERE JOB = 'president' OR JOB = 'MANAGER' GROUP BY JOB
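The equivalence of the two GROUP BY queries can be confirmed on a few rows. This is a sketch under stated assumptions: Python's sqlite3 module stands in for Oracle, the salary figures are invented, and the job titles are written in uppercase. Only the results are compared, not the running time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (job TEXT, sal INTEGER);
    INSERT INTO emp VALUES
        ('PRESIDENT', 5000), ('MANAGER', 3000),
        ('MANAGER', 2000), ('CLERK', 1000);
""")

# Groups are built for every job, then unwanted groups are discarded.
having_rows = conn.execute("""
    SELECT job, AVG(sal) FROM emp
    GROUP BY job
    HAVING job = 'PRESIDENT' OR job = 'MANAGER'
    ORDER BY job
""").fetchall()

# Unwanted rows are discarded first, so fewer rows reach the grouping step.
where_rows = conn.execute("""
    SELECT job, AVG(sal) FROM emp
    WHERE job = 'PRESIDENT' OR job = 'MANAGER'
    GROUP BY job
    ORDER BY job
""").fetchall()

print(having_rows == where_rows)
```

The rewrite is valid here because the condition refers only to the grouping column; a condition on the aggregate itself, such as one on AVG(sal), has to stay in HAVING.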
Optimization of SQL statements for database performance optimization