We should not only write SQL statements, but also write SQL statements with excellent performance. The following sections will help the author learn, extract, and summarize some materials to share with you!
(1)Select the most efficient table name sequence (valid only in the rule-based Optimizer ):
The Oracle parser processes the table names in the from clause in the order from right to left. The table written in the from clause (basic table driving table) will be processed first, when the from clause contains multiple tables, You must select the table with the least number of records as the base table. If more than three tables are connected for query, You need to select an intersection table as the base table, which is the table referenced by other tables.
(2)The connection sequence in the WHERE clause .:
Oracle uses the bottom-up sequence to parse the WHERE clause. According to this principle, the join between tables must be written before other where conditions. The conditions that can filter out the maximum number of records must be written at the end of the WHERE clause.
(3)Avoid '*' in the select clause '*':
During the parsing process, Oracle converts '*' into all column names in sequence. This task is completed by querying the data dictionary, which means it takes more time.
(4)Reduce the number of database accesses:
Oracle has performed a lot of internal work: parsing SQL statements, estimating index utilization, binding variables, and reading data blocks;
(5)You can reset the arraysize parameter in SQL * Plus, SQL * forms, and Pro * C to increase the retrieval data volume for each database access. The proposed value is 200.
(6)Use the decode function to reduce the processing time:
You can use the decode function to avoid repeated scan of the same record or join the same table.
(7)Simple Integration and unrelated database access:
If you have several simple database query statements, you can integrate them into a single query (even if there is no relationship between them)
(8)Delete duplicate records:
The most efficient way to delete duplicate records (because rowid is used) is as follows:
Deletefrom EMP e where E. rowid> (select Min (X. rowid)
From emp x where X. emp_no = E. emp_no );
(9)Replace Delete with truncate:
When deleting records in a table, rollbacksegments are usually used to store information that can be recovered. if you do not have a commit transaction, Oracle will recover the data to the State before the deletion (which is precisely the State before the deletion command is executed). When truncate is used, rollback segments no longer store any recoverable information. after the command is run, the data cannot be restored. therefore, few resources are called and the execution time is short. (The translator Press: truncate applies only to deleting the entire table, and truncate is DDL rather than DML)
(10)Use commit as much as possible:
As long as possible, use commit as much as possible in the program, so that the program's performance is improved, and the demand will also be reduced by the resources released by commit:
Resources released by commit:
A. Information used to restore data on the rollback segment.
B. Locks obtained by Program Statements
C. Space in redo log Buffer
D. Oracle manages the internal costs of the above three types of resources
(11)Replace having clause with the WHERE clause:
Avoid using the having clause. Having filters the result set only after all records are retrieved. this process requires sorting, total, and other operations. if the WHERE clause can be used to limit the number of records, this overhead can be reduced. (in non-Oracle) Where on, where, and having can be added, on is the first statement to execute, where is the second clause, and having is the last clause, because on filters out records that do not meet the conditions before making statistics, it can reduce the data to be processed by intermediate operations. It is reasonable to say that the speed is the fastest, where should also be faster than having, because it filters data before sum is performed, and on is used only when two tables are joined, so in a table, then we can compare where with having. In the case of single-Table query statistics, if the filter condition does not involve fields to be calculated, the results will be the same, but the where technology can be used, having cannot be used. In terms of speed, the latter must be slow. If a calculated field is involved, the value of this field is uncertain before calculation, according to the workflow written in the previous article, the where function time is completed before computing, and having works only after computing. In this case, the results are different. In multi-table join queries, on takes effect earlier than where. The system first combines multiple tables into a temporary table based on the join conditions between tables, then filters them by where, then computes them, and then filters them by having after calculation. It can be seen that to filter a condition to play a correct role, you must first understand when the condition should take effect, and then decide to put it there.
(12)Reduce queries to a table:
In SQL statements containing subqueries, pay special attention to reducing the number of queries to the table. Example:
Select tab_name from tables where (tab_name, db_ver) = (select
Tab_name, db_verfrom tab_columns where version = 604)
(13)Improve SQL efficiency through internal functions .:
Complex SQL statements tend to sacrifice execution efficiency. It is very meaningful to grasp the above methods to solve problems by using functions.
(14)Use the table alias (alias ):
When connecting multiple tables in an SQL statement, use the table alias and prefix the alias on each column. in this way, the parsing time can be reduced and the syntax errors caused by column ambiguity can be reduced.
(15)Replace in with exists and not exists instead of not in:
In many basic table-based queries, to meet one condition, you often need to join another table. in this case, using exists (or not exists) usually improves the query efficiency. in a subquery, the not in Clause executes an internal sorting and merging. in either case, not in is the most inefficient (because it executes a full table traversal for the table in the subquery ). to avoid the use of not in, we can rewrite it into an outer join (outer joins) or not exists.
Example:
(Efficient) Select * from EMP (basic table) Where empno> 0 andExists(Select 'x' from Dept where Dept. deptno = EMP. deptno and loc = 'melb ')
(Inefficient) Select * from EMP (basic table) Where empno> 0 and deptnoIn(Select deptno from Dept where loc = 'melb ')
(16) Identifying 'inefficient execution' SQL statements:
Although a variety of graphical tools for SQL optimization are emerging, writing your own SQL tools is always the best way to solve the problem:
Select executions, disk_reads, buffer_gets,
Round (BUFFER_GETS-DISK_READS)/buffer_gets, 2) hit_radio,
Round (disk_reads/executions, 2) reads_per_run,
SQL _text
From v $ sqlarea
Where executions> 0
And buffer_gets> 0
And (BUFFER_GETS-DISK_READS)/buffer_gets <0.8
Order by 4 DESC;
(17)Improve efficiency with indexes:
An index is a conceptual part of a table to improve data retrieval efficiency. Oracle uses a complex self-balancing B-tree structure. data Query by index is usually faster than full table scan. when Oracle finds the optimal path for executing the query and update statements, the Oracle optimizer uses the index. using indexes when joining multiple tables can also improve efficiency. another advantage of using an index is that it provides unique verification of the primary key .. For those long or long raw data types, You Can index almost all columns. generally, using indexes in large tables is particularly effective. of course, you will also find that using indexes to scan small tables can also improve efficiency. although the index can improve the query efficiency, we must pay attention to its cost. the index requires space for storage and regular maintenance. The index itself is also modified whenever a record is increased or decreased in the table or the index column is modified. this means that the insert, delete, and update operations for each record will pay four or five more disk I/O. because indexes require additional storage space and processing, unnecessary indexes will slow the query response time .. Regular index reconstruction is necessary .:
Alter index <indexname> rebuild <tablespacename>
(18)Replace distinct with exists:
When you submit a query that contains one-to-many table information (such as the Department table and employee table), avoid using distinct in the select clause. in general, you can consider replacing it with exist, and exists makes the query more rapid, because the RDBMS core module will return the result immediately after the subquery conditions are met. example:
(Inefficient ):
Select distinctdept_no, dept_name from Dept D, EMP E
Where D. dept_no = E. dept_no
(Efficient ):
Select dept_no, dept_name fromdept d Where exists (select 'x'
From EMP e wheree. dept_no = D. dept_no );
(19)SQL statements are written in uppercase. Because Oracle always parses SQL statements first, converts lowercase letters to uppercase and then executes them.
(20)Use the connector "+" to connect strings as few as possible in Java code!
InSyntax comparison with exist:
Select × from data table t where T. X in (...)
T. X field type value set, such as ('1', '2', '3'), but if T. when X is of the number type, it seems that such a write method will have a problem. It can also be a set of values queried through another SELECT statement, for example, (select y from data table 2 where ...).
Select * from data table t where [...] and exist (...)
Other query conditions are in square brackets. Exist can be any condition in the brackets. This condition can be irrelevant to the query outside, or can be combined with the condition outside. For example: (select * from data table 2 where 1 = 1) or (select * from data table 2 where Y = T. X)
Example:
SQL statement of in
Select ID, category_id, htmlfile, title, convert (varchar (20), begintime, 112) aspubtime
From tab_oa_pub where is_check = 1 and
Category_id in (select ID from tab_oa_pub_cate where no = '1 ')
Order by begintime DESC
Exists SQL statement
Select ID, category_id, htmlfile, title, convert (varchar (20), begintime, 112) aspubtime
From tab_oa_pub where is_check = 1 and
Exists (select ID from tab_oa_pub_cate wheretab_oa_pub.category_id = convert (INT, no) and no = '1 ')
Order by begintime DESC
Efficiency Comparison:
First, we will discuss in and exists.
Select * from T1 where X in (select y from T2)
In fact, it can be understood:
Select * from T1, (select distinct y from T2) T2 where t1.x = t2.y
If you have some experience in SQL optimization, You can naturally think that T2 cannot be a big table, because it is necessary to perform "unique sorting" on the entire table of T2 ", if T2 is large, the sorting performance is intolerable. But T1. why? The most common understanding is that t1.x = t2.y can be indexed. But this is not a good explanation. Imagine if t1.x and t2.y both have indexes, we know that indexes are ordered. Therefore, the best solution between T1 and T2 is merge join. In addition, if t2.y has an index, the sorting performance of T2.
Select * from T1 where exists (select null from T2 where Y = X)
It can be understood:
For X in (select * from T1)
Loop
If (exists (select null from T2 where Y = x. x) then
Output the record!
End if
End Loop
This is easier to understand. T1 is always a table scan! Therefore, T1 cannot be a large table, while T2 can be large, because y = x. x can be indexed by t2.y.
Based on the above discussions on in/exists, we can draw a general conclusion:InIt is suitable for the case where the external table is large but the internal table is small; exists is suitable for the case where the external table is small but the internal table is large.
There are two simple examples to illustrate the efficiency of "exists" and "in ".
1) Select * from T1 where exists (select 1 from T2 where t1.a = t2.a );
Small Data size of T1 and large data size of T2, t1 <t2, 1) high query efficiency.
2) Select * from T1 where t1.a in (select t2.a from T2 );
T1 has a large data volume and T2 has an hour, T1> T2, 2) High query efficiency.
Exists usage:
Note: 1) the section in the sentence contains a color font to understand its meaning;
"Select 1 from T2 where t1.a = t2.a" is equivalent to a join Table query, which is equivalent
"Select 1 from T1, T2 where t1.a = t2.a"
However, if you execute the statement in parentheses (1), a syntax error will be reported, which is also worth attention when using exists.
"Exists (XXX)" indicates whether the statement in parentheses can identify the record and whether the record to be queried exists.
Therefore, the "1" in "select 1" is irrelevant. It is okay to replace it with "*". It only cares whether the data in the brackets can be searched out, whether such a record exists. If so, the where condition of the sentence is true.
Usage of in:
Continue to reference the above example
"2) Select * from T1 where t1.a in (select t2.a from T2 )"
The content of the field searched by the statement following the "in" must correspond to each other. Generally, the expression of field a in Table T1 and table t2 must be the same, otherwise, this query is meaningless.
For example, table T1 and table T2 have a field indicating the ticket number. However, table T1 indicates that the ticket number field is named "ticketid" and table T2 indicates "ID ", however, the expression is the same, and the data format is the same. In this case, use the 2) method as follows:
"Select * from T1 where t1.ticketid in (select t2.id from T2 )"
Select name from employee where name not in (Select name fromstudent );
Select name from employee where not exists (Select name fromstudent );
The first SQL statement is less efficient than the second statement.
By using exists, Oracle first checks the primary query, and then runs the subquery until it finds the first match, which saves time. When Oracle executes the in subquery, it first executes the subquery and stores the obtained result list in a temporary table with an index. Before executing a subquery, the system suspends the primary query. After the subquery is executed, it is stored in the temporary table and then executes the primary query. This is why exists is faster than in queries.
Differences between in and exists
1) The later tables use exists for high efficiency, and the later tables use in for high efficiency.
2) it is inappropriate to specify the exists clause for non-correlated subqueries.
3) No matter which table is large, not exists is faster than not in.
4) Try not to use the not in clause. The minus clause is faster than the not in clause,
In performs hash join on the External table and the internal table, while exists performs loop on the External table. Each loop then queries the internal table.
The argument that exists is more efficient than in is always inaccurate.
If the two tables to be queried are of the same size, there is little difference between in and exists.
If one of the two tables is small and the other is a large table, use exists for the large subquery table and in for the small subquery table: exists for the later table and in for the later table
Example: Table A (small table) and Table B (large table)
1:
Select * from a where CC in (select CC from B)
Low Efficiency: the CC column index of Table A is used;
Select * from a where exists (select CC from B where cc = A. CC)
High Efficiency: the CC column index of Table B is used.
Opposite
2:
Select * from B where CC in (select CC from)
High Efficiency: the CC column index of Table B is used;
Select * from B where exists (select CC from a where cc = B. CC)
Low Efficiency: the CC column index of Table A is used.
Correlated subqueries with in are redundant because the in clause provides the same functions as related operations in subqueries. For example:
Select staff_name from staff_member where staff_id in
(Select staff_id from staff_func wherestaff_member.staff_id = staff_func.staff_id );
It is not appropriate to specify the exists clause for a non-correlated subquery, because this will produce the flute card product. For example:
Select staff_name from staff_member where staff_id
Exists (select staff_id from staff_func );
Not in and not exists
If the query statement uses not in, all the internal and external tables are scanned, and no index is used;
However, subqueries of not extsts can still use table indexes.
Therefore, no matter which table is large, not exists is faster than not in.
Try not to use the not in clause. The minus clause is faster than the not in clause, although the minus clause requires two queries:
Select staff_name from staff_member where staff_id in (select staff_id fromstaff_member minus select staff_id from staff_func where func_id like '123 ');
Difference between in and "="
Select name from student where name in ('zhang ', 'wang', 'lil', 'zhao ');
And
Select name from student where name = 'zhang' or name = 'lil' orname = 'wang' or name = 'zhao'
The results are the same.