Optimize the design of Oracle Index

Last Update:2014-06-15 Source: Internet

Author: User

Tags mathematical functions

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

V. Index Optimization Design

1. Manage organization Indexes

The index can greatly accelerate the database query speed. The index maps the logical value in the table to a safe RowID, so the index can quickly locate the physical address of the data. However, some DBAs find that the indexing of a large table does not improve the data query speed, but affects the performance of the entire database. This is mainly related to the data management method of SGA. When Oracle performs data block high-speed cache management, index data has higher resident permissions than normal data. During Space competition, ORACLE first removes normal data. When you query a large table with an index, index data may use up all the data block cache space. ORACLE has to perform disk read/write frequently to obtain data, therefore, after partitioning a large table, you can create a partition index based on the corresponding partition. If the data in such a large table is frequently queried, or the index is not created. In addition, when creating an index, DBA should try to ensure that the index is most likely to be used in the where clause. If only one index is created for the query, it will not necessarily speed up, because the index must specify a suitable access path. 2. Use of clustering

Oracle provides another method to increase the query speed, that is, clustering ). In simple words, clustering stores several tables together and stores them in a mixture of public attributes. The data of multiple tables is stored in the same Oracle block by the common code value of the cluster root data. Then, when a group of Oracle blocks is retrieved, the data of the two tables is obtained at the same time, in this way, the Oracle block to be stored can be reduced to improve the application performance.

3. Optimize the configured index to make full use of it to speed up database access. ORACLE needs to use an index. There are some basic conditions: 1) this field in the where subname must be the first field of the composite index; 2) and where subnames should not be involved in any form of calculation. Sal * (2*90/100)

Vi. Use of multiple CPU and Parallel Query PQO (Parallel Query Option)

1. Try to use multiple CPU processors for transaction processing and query

The rapid development of CPU makes ORACLE pay more and more attention to the application of multi-CPU Parallel technology. The access to a single database can be completed by the combination of multiple CPUs. In addition, distributed computing is already common, whenever possible, the CPU requests of the database server and the application should be separated, or the CPU requests should be moved from one server to another. For multiple CPU Systems, use Parallel Query Option (PQO, Parallel Query Option) for database operations.

2. Use Parallel Query Option (PQO, select Parallel Query) to Query data

With the PQO method, you can not only allocate SQL statement request processing among multiple CPUs, but also read data from independent processes when the queried data is on different disks.

3. Use the SQL * Loader Direct Path option to load a large amount of data

When this method is used for data loading, the program creates formatted data blocks and writes them directly to the data file. other I/O of the database kernel is not required.

7. Implement the system resource management allocation plan

ORACLE provides a Database Resource Manager (DRM) to control user Resource allocation. DBAs can use it to allocate system Resource percentages for user and job classes. In an OLDP system, 75% of CPU resources can be allocated to online users, and the remaining 25% is reserved for batch users. In addition, you can perform multi-level CPU allocation. In addition to CPU resource allocation, DRM can also limit the parallel operations on resource user groups.

8. Use the best Database Connection case with SQL Optimization

1. Use the direct ole db database connection method.

You can use ADO to connect to a database in two ways. One is the traditional ODBC method and the other is the ole db method. ADO is based on the ole db technology. To support ODBC, you must establish the call conversion from the corresponding ole db to ODBC. However, you do not need to convert the data using the direct ole db method, this increases the processing speed.

2. Use the Connection Pool mechanism

In database processing, the biggest resource cost is to establish a database connection, and the user still has a long connection wait time. The solution is to reuse the existing Connection, that is, use the Connection Pool object mechanism.

The Connection Pool principle is: a Connection buffer Pool is maintained in the IIS + ASP system, so that when the next user accesses, a database Connection is directly obtained in the Connection buffer Pool, instead of reconnecting to the database, the system response speed can be greatly improved.

3. Efficient SQL statement Design

Generally, the following methods can be used to optimize the performance of SQL statements on data operations:

(1) Reduce the number of queries to the database, that is, reduce the number of queries to the database by using distributed database objects such as snapshots and graphs.

(2) try to use the same or very similar SQL statements for queries. This not only makes full use of the analyzed syntax tree in the SQL sharing pool, the possibility of hitting the data to be queried in SGA is also greatly increased.

(3) restrict the use of dynamic SQL. Although dynamic SQL is useful, dynamic SQL performs syntax analysis again even if there is a completely identical query value in the SQL sharing pool.

(4) Avoid executing SQL statements without any conditions. When an SQL statement without any conditions is executed, it is usually required to perform the FTS. The database first locates a data block and then searches for other data in sequence, this is a long process for large tables.

(5) If you have constraints on the data in some tables, it is best to use the description integrity of the SQL statements in the table to be created, rather than in the SQL program.

(6) You can cancel the automatic COMMIT mode to aggregate SQL statements into a set of execution before committing them in a centralized manner. The program can also use COMMIT and ROLLBACL to submit and roll back the transaction explicitly.

(7) It takes a long time to retrieve a large amount of data. setting the number of row prefetch can improve the system performance and set a maximum value. When the SQL statement returns a row that exceeds this value, the numeric database temporarily stops execution unless the user sends a new command to start organizing and displaying data, rather than waiting for the user to continue.

9. Make full use of the data background processing solution to reduce network traffic

1. Create a temporary table or view reasonably

Creating a temporary table or view is to create a new table or view on the basis of the database as needed. A new table can be created for querying information after multi-Table Association. For a single-Table query, a view can be created, in this way, we can take full advantage of the large capacity and strong scalability of the Data Base. All condition judgment and numerical calculation statistics can be processed in the database server background and then appended to the temporary table, the process of forming data results can be implemented by the database process or function.

2. full use of database Packaging Technology

Use the database Description Language to compile the database process or function, and pack the process or function into a uniform running package in the database background.

3. Use of data replication, snapshots, views, and Remote Procedure Call Technology

Data Replication: Copies data to the local database at a time, so that local data will be used for future queries, but it is only suitable for those data with little changes. Snapshots can also be used to dynamically copy data between distributed databases, define the automatic snapshot refresh time or manual refresh, to ensure the integrity of reference data. The Remote Call process also greatly reduces network congestion caused by frequent SQL statement calls.

In short, there is no uniform solution to all performance problems, but ORACLE provides a rich selection environment, the architecture, software structure, model object, and specific business and technical implementation of the ORACLE database can be considered as a whole. To improve the system performance, a system-wide approach is required. During database optimization, the application, I/O subsystem, and operating system (OS) should be optimized accordingly. Optimization is the process of purposefully changing one or more components of the system to meet one or more goals. For Oracle, optimization aims to adjust the component level to improve performance, that is, to increase throughput and reduce response time. If DBA can comprehensively consider the optimization solution from the nine aspects above, it is believed that most ORACLE Applications can achieve optimal data access.

We should not only write SQL statements, but also write SQL statements with excellent performance. The following sections will help the author learn, extract, and summarize some materials to share with you!

(1) select the most efficient table name sequence (only valid in the rule-based Optimizer ):

The ORACLE parser processes the table names in the FROM clause in the order FROM right to left. The table written in the FROM clause (basic table driving table) will be processed first, when the FROM clause contains multiple tables, You must select the table with the least number of records as the base table.

If more than three tables are connected for query, You need to select an intersection table as the base table, which is the table referenced by other tables. put tables with less data at the end of the FROM

(2) join order in the WHERE clause .:

ORACLE uses the bottom-up sequence to parse the WHERE clause. According to this principle, the join between tables must be written before other WHERE conditions. The conditions that can filter out the maximum number of records must be written at the end of the WHERE clause.

(3) Avoid using '*' in the SELECT clause '*':

During the parsing process, ORACLE converts '*' into all column names in sequence. This is done by querying the data dictionary, which means it takes more time.

(4) Reduce the number of visits to the database:

ORACLE has performed a lot of internal work: parsing SQL statements, estimating index utilization, binding variables, and reading data blocks;

(5) re-set the ARRAYSIZE parameter in SQL * Plus, SQL * Forms, and Pro * C to increase the retrieval data volume for each database access. The recommended value is 200.

(6) use the DECODE function to reduce processing time: ******************************

You can use the DECODE function to avoid repeated scan of the same record or join the same table.

Decode (expression, search_1, result_1)

Decode (expression, search_1, result_1, search_2, result_2)

Decode (expression, search_1, result_1, search_2, result_2,..., search_n, result_n)

Decode (expression, search_1, result_1, default)

Decode (expression, search_1, result_1, search_2, result_2, default)

Decode (expression, search_1, result_1, search_2, result_2,..., search_n, result_n, default)

The comparison expression and search word of the decode function. If the expression matches, the return result is returned. If the expression does not match, the default value is returned. If the default value is not defined, the return value is null.

The following is a simple test to Decode the function usage:

SQL> create table t as select username, default_tablespace, lock_date from dba_users;

Table created.

SQL> select * from t;

USERNAME DEFAULT_TABLESPACE LOCK_DATE

---------------------------------------------------------------------

SYS SYSTEM

SYSTEM

OUTLN SYSTEM

CSMIG SYSTEM

SCOTT SYSTEM

EYGLE USERS

DBSNMP SYSTEM

Wmsys system 20--0-04

8 rows selected.

SQL> select username, decode (lock_date, null, 'unlocked', 'locked') status from t;

USERNAME STATUS

--------------------------------------

SYS unlocked

SYSTEM unlocked

OUTLN unlocked

CSMIG unlocked

SCOTT unlocked

EYGLE unlocked

DBSNMP unlocked

WMSYS locked

8 rows selected.

SQL> select username, decode (lock_date, null, 'unlocked') status from t;

USERNAME STATUS

--------------------------------------

SYS unlocked

SYSTEM unlocked

OUTLN unlocked

CSMIG unlocked

SCOTT unlocked

EYGLE unlocked

DBSNMP unlocked

WMSYS

8 rows selected.

(7) simple integration with no associated database access:

If you have several simple database query statements, you can integrate them into a single query (even if there is no relationship between them)

(8) delete duplicate records:

The most efficient way to delete duplicate records (because ROWID is used) is as follows:

Delete from emp e where e. ROWID> (select min (X. ROWID)

From emp x where x. EMP_NO = E. EMP_NO );

(9) replace DELETE with TRUNCATE:

When deleting records in a table, a rollback segment is usually used to store information that can be recovered. if you do not have a COMMIT transaction, ORACLE will recover the data to the State before the deletion (which is precisely the State before the deletion command is executed). When TRUNCATE is used, rollback segments no longer store any recoverable information. after the command is run, the data cannot be restored. therefore, few resources are called and the execution time is short. (The translator Press: TRUNCATE applies only to deleting the entire table, and TRUNCATE is DDL rather than DML)

(10) Try to use COMMIT as much as possible:

As long as possible, use COMMIT as much as possible in the program, so that the program's performance is improved, and the demand will also be reduced by the resources released by COMMIT:

Resources released by COMMIT:

A. Information used to restore data on the rollback segment.

B. Locks obtained by Program Statements

C. Space in redo log buffer

D. ORACLE manages the internal costs of the above three types of resources

(11) replace HAVING clause with the Where clause:

Avoid using the HAVING clause. HAVING filters the result set only after all records are retrieved. this process requires sorting, total, and other operations. if the WHERE clause can be used to limit the number of records, this overhead can be reduced. (in non-oracle) where on, where, and having can be added, on is the first statement to execute, where is the second clause, and having is the last clause, because on filters out records that do not meet the conditions before making statistics, it can reduce the data to be processed by intermediate operations. It is reasonable to say that the speed is the fastest, where should also be faster than having, because it performs sum only after filtering data, and on is used only when two tables are joined, so in a table, then we can compare where with having. In the case of single-Table query statistics, if the filter condition does not involve fields to be calculated, the results will be the same, but the where technology can be used, having cannot. The latter must be slow in terms of speed. If it involves a calculated field, it means that the value of this field is uncertain before calculation, according to the workflow written in the previous article, the where function is completed before computing, and having is used only after computing. In this case, the results are different. In multi-table join queries, on takes effect earlier than where. The system first combines multiple tables into a temporary table based on the join conditions between tables, then filters them by where, then computes them, and then filters them by having after calculation. It can be seen that to filter a condition to play a correct role, you must first understand when the condition should take effect, and then decide to put it there.

(12) Reduce table queries:

In SQL statements containing subqueries, pay special attention to reducing the number of queries to the table. Example:

SELECT TAB_NAME from tables where (TAB_NAME, DB_VER) = (SELECT

TAB_NAME, DB_VER FROM TAB_COLUMNS where version = 604)

(13) Improve SQL efficiency through internal functions .:

Complex SQL statements tend to sacrifice execution efficiency. It is very meaningful to grasp the above methods to solve problems by using functions.

(14) use the table Alias (Alias ):

When connecting multiple tables in an SQL statement, use the table alias and prefix the alias on each Column. in this way, the parsing time can be reduced and the syntax errors caused by Column ambiguity can be reduced.

(15) Replace IN with EXISTS and not exists instead of not in:

In many basic table-based queries, to meet one condition, you often need to join another table. in this case, using EXISTS (or not exists) usually improves the query efficiency. IN a subquery, the not in Clause executes an internal sorting and merging. IN either case, not in is the most inefficient (because it executes a full table traversal for the table IN the subquery ). to avoid the use of not in, we can rewrite it into an Outer join (Outer Joins) or not exists.

Example:

(Efficient) SELECT * from emp (basic table) where empno> 0 and exists (SELECT 1 from dept where dept. DEPTNO = EMP. deptno and loc = 'melb ')

(Inefficient) SELECT * from emp (basic table) where empno> 0 and deptno in (select deptno from dept where loc = 'melb ')

(16) Identifying 'inefficient execution' SQL statements:

Although a variety of graphical tools for SQL optimization are emerging, writing your own SQL tools is always the best way to solve the problem:

Select executions, DISK_READS, BUFFER_GETS,

ROUND (BUFFER_GETS-DISK_READS)/BUFFER_GETS, 2) Hit_radio,

ROUND (DISK_READS/EXECUTIONS, 2) Reads_per_run,

SQL _TEXT

From v $ SQLAREA

Where executions> 0

AND BUFFER_GETS> 0

AND (BUFFER_GETS-DISK_READS)/BUFFER_GETS

Order by 4 DESC;

(17) using indexes to improve efficiency:

An index is a conceptual part of a table to improve data retrieval efficiency. ORACLE uses a complex self-balancing B-tree structure. data Query by index is usually faster than full table scan. when ORACLE finds the optimal path for executing the query and Update statements, the ORACLE optimizer uses the index. using indexes when joining multiple tables can also improve efficiency. another advantage of using an index is that it provides uniqueness verification for the primary key .. For those LONG or long raw data types, You Can index almost all columns. generally, using indexes in large tables is particularly effective. of course, you will also find that using indexes to scan small tables can also improve efficiency. although the index can improve the query efficiency, we must pay attention to its cost. the index requires space for storage and regular maintenance. The index itself is also modified whenever a record is increased or decreased in the table or the index column is modified. this means that the INSERT, DELETE, and UPDATE operations for each record will pay four or five more disk I/O. because indexes require additional storage space and processing, unnecessary indexes will slow the query response time .. Regular index reconstruction is necessary: there is a "garbage file cleanup" in "System Maintenance cleanup"

ALTER INDEX REBUILD

(18) replace DISTINCT with EXISTS:

When you submit a query that contains one-to-many table information (such as the Department table and employee table), avoid using DISTINCT in the SELECT clause. in general, you can consider replacing it with EXIST, and EXISTS makes the query more rapid, because the RDBMS core module will return the result immediately after the subquery conditions are met. example:

(Inefficient ):

Select distinct DEPT_NO, DEPT_NAME from dept d, EMP E

Where d. DEPT_NO = E. DEPT_NO

(Efficient ):

SELECT DEPT_NO, DEPT_NAME from dept d where exists (SELECT 'x'

From emp e where e. DEPT_NO = D. DEPT_NO );

(19) SQL statements are written in upper case, because oracle always parses SQL statements first, converts lowercase letters to uppercase letters, and then executes

(20) try to use the connector "+" to connect strings in java code!

(21) Avoid using NOT in index columns,

We should avoid using NOT in the index column, NOT will have the same impact as using the function in the index column. when ORACLE Encounters "NOT", it stops using indexes and performs full table scanning.

(22) Avoid using computation on index columns.

In the WHERE clause, if the index column is part of the function, the optimizer will use full table scan without using the index.

Example:

Inefficiency:

SELECT... From dept where sal * 12> 25000;

Efficient:

SELECT... From dept where sal> 25000/12;

(23) Replace with> =>

Efficient:

SELECT * from emp where deptno> = 4

Inefficiency:

SELECT * from emp where deptno> 3

The difference between the two lies in that the former DBMS will jump directly to the first record whose DEPT is equal to 4, while the latter will first locate the record whose DEPTNO is = 3 and scan forward to the record whose first DEPT is greater than 3.

(24) replace OR with UNION (applicable to index columns)

In general, replacing OR in the WHERE clause with UNION will produce better results. using OR for index columns will scan the entire table. note that the preceding rules are only valid for multiple index columns. if a column is not indexed, the query efficiency may be reduced because you did not select OR. in the following example, both LOC_ID and REGION have indexes.

Efficient:

SELECT LOC_ID, LOC_DESC, REGION

FROM LOCATION

WHERE LOC_ID = 10

UNION

SELECT LOC_ID, LOC_DESC, REGION

FROM LOCATION

Where region = "MELBOURNE"

Inefficiency:

SELECT LOC_ID, LOC_DESC, REGION

FROM LOCATION

WHERE LOC_ID = 10 or region = "MELBOURNE"

If you insist on using OR, you need to write the index columns with the least records at the beginning.

(25) use IN to replace OR

This is a simple and easy-to-remember rule, but the actual execution results must be tested. in ORACLE8i, the execution paths of the two seem to be the same.

Inefficiency:

SELECT .... From location where LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30

Efficient

SELECT... From location where LOC_IN IN (10, 20, 30 );

(26) Avoid using is null and is not null in the index column.

To avoid using any columns that can be empty in the index, ORACLE will not be able to use this index. this record does not exist in the index if the column package contains null values. for a composite index, if each column is empty, this record does not exist in the index. if at least one column is not empty, the record is stored in the index. for example, if the unique index is created in column A and column B of the table, and the and B values of A record exist in the table are (123, null ), ORACLE will not accept the next record with the same A, B value (123, null) (insert ). however, if all index columns are empty, ORACLE considers the entire key value to be null, but null is not equal to null. therefore, you can insert 1000 records with the same key value. Of course, they are empty! Because the null value does not exist in the index column, the Null Value Comparison of the index column in The WHERE clause will disable ORACLE.

Inefficiency: (index failure)

SELECT... From department where DEPT_CODE is not null;

Efficient: (index valid)

SELECT... From department where DEPT_CODE> = 0;

(27) always use the first column of the index:

If an index is created on multiple columns, the optimizer selects this index only when its first column (leading column) is referenced by the where clause. this is also a simple and important rule. When only the second column of the index is referenced, the optimizer uses the full table scan and ignores the index.

(28) replace UNION with UNION-ALL (if possible ):

When the SQL statement needs to UNION two query result sets, these two result sets will be merged in the formula of UNION-ALL, and then sorted before the final result is output. if union all is used to replace UNION, sorting is unnecessary. the efficiency will be improved accordingly. note that union all will repeatedly output the same records in the two result sets. therefore, you still need to analyze the feasibility of using union all from the business needs. UNION sorts the result set. This operation uses SORT_AREA_SIZE memory. this memory optimization is also very important. the following SQL can be used to query the consumption of sorting

Inefficiency:

SELECT ACCT_NUM, BALANCE_AMT

FROM DEBIT_TRANSACTIONS

WHERE TRAN_DATE = '31-DEC-95'

UNION

SELECT ACCT_NUM, BALANCE_AMT

FROM DEBIT_TRANSACTIONS

WHERE TRAN_DATE = '31-DEC-95'

Efficient:

SELECT ACCT_NUM, BALANCE_AMT

FROM DEBIT_TRANSACTIONS

WHERE TRAN_DATE = '31-DEC-95'

UNION ALL

SELECT ACCT_NUM, BALANCE_AMT

FROM DEBIT_TRANSACTIONS

WHERE TRAN_DATE = '31-DEC-95'

(29) replace order by with WHERE:

The order by clause only uses indexes under two strict conditions.

All columns in order by must be included in the same index and maintained in the ORDER of the index.

All columns in order by must be defined as non-empty.

The index used BY the WHERE clause and the index used in the order by clause cannot be tied together.

For example:

The DEPT table contains the following columns:

DEPT_CODE PK NOT NULL

DEPT_DESC NOT NULL

DEPT_TYPE NULL

Inefficiency: (indexes are not used)

SELECT DEPT_CODE from dept order by DEPT_TYPE

Efficiency: (using indexes)

SELECT DEPT_CODE from dept where DEPT_TYPE> 0

(30) Avoid changing the index column type:

ORACLE automatically converts columns to different types of data.

Assume that EMPNO is a numeric index column.

SELECT... From emp where empno = '20140901'

In fact, after ORACLE type conversion, the statement is converted:

SELECT... From emp where empno = TO_NUMBER ('123 ')

Fortunately, the type conversion does not occur on the index column, and the purpose of the index is not changed.

Assume that EMP_TYPE is a character-type index column.

SELECT... From emp where EMP_TYPE = 123

This statement is converted:

SELECT... From emp where TO_NUMBER (EMP_TYPE) = 123

This index will not be used because of internal type conversion! To avoid implicit type conversion for your SQL statements, it is best to explicitly convert the type conversion. Note that when comparing the character and value, ORACLE will first convert the value type to the character type.

(31) WHERE clause to be careful:

The WHERE clause in some SELECT statements does not use indexes. Here are some examples.

In the example below, (1 )'! = 'No index is used. remember, indexes only tell you what exists in the table, but not what does not exist in the table. (2) '|' is a character concatenation function. as with other functions, indexes are disabled. (3) '+' is a mathematical function. as with other mathematical functions, indexes are disabled. (4) The same index Columns cannot be compared with each other, which enables full table scan.

(32) a. If the retrieved data volume exceeds 30% of the number of records in the table, using indexes will not significantly improve the efficiency.

B. in certain cases, using indexes may be slower than full table scanning, but this is an order of magnitude difference. in general, using an index is several times or even several thousand times more than a full table scan!

(33) Avoid resource-consuming operations:

SQL statements with DISTINCT, UNION, MINUS, INTERSECT, and order by will start the SQL engine.

Execute the resource-consuming sorting (SORT) function. DISTINCT requires a sorting operation, while other operations require at least two sorting operations. generally, SQL statements with UNION, MINUS, and INTERSECT can be rewritten in other ways. if your database's SORT_AREA_SIZE is well configured, you can also consider using UNION, MINUS, and INTERSECT. After all, they are highly readable.

(34) Optimize group:

To improve the efficiency of the group by statement, you can filter out unnecessary records before group by. The following two queries return the same results, but the second query is much faster.

Inefficiency:

Select job, AVG (SAL)

FROM EMP

GROUP JOB

Having job = 'President'

Or job = 'manager'

Efficient:

Select job, AVG (SAL)

FROM EMP

Where job = 'President'

Or job = 'manager'

GROUP JOB

Specify the index statement when ORACLE queries or deletes the query.

You can specify the index statement when querying.

SELECT/* + index (TB_ALIAS IX_G_COST3 )*/

TB_ALIAS .*

FROM g_Cost TB_ALIAS

WHERE Item_Two = 0

AND Flight_Date> = To_Date ('20140901', 'yyyymmdd ')

AND Flight_Date '123', 'yyyymmdd ');

You can also specify the index statement when deleting an index.

DELETE/* + index (TB_ALIAS IX_G_COST1 )*/

FROM g_Cost TB_ALIAS

WHERE ITEM_NAME = 'hourly Bill ';

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More