Introduction to ORACLE indexes and high-performance SQL
Source: Internet
Author: User
What is an index?
An index is an auxiliary object created in one or more columns of a table. It aims to accelerate access to data in the table;
The data structure of the Oracle storage index is B * tree, and the same is true for the bitmap index, except that the leaf nodes have different B * Number indexes;
An Index consists of the root node, branch node, and leaf node. The parent index block contains the index data of the lower-level index block. The leaf node contains the index data and the rowid used to determine the actual location of the row.
Purpose of using Indexes
Faster Query
Reduce I/O operations
Eliminate disk sorting
When to use Indexes
Query the number of returned records
Sort table <40%
Non-sorted table <7%
Table shards are large (frequent increase and deletion)
Index type
Non-unique index (most commonly used)
Unique Index
Bitmap Index
Partial partitioned index with a prefix
Partial index partitioning without prefix
Globally prefix partitioned Index
Hash partition Index
Function-based index
Guidelines for managing Indexes
Insert data into the table and create an index
. After using SQL * Loader or import tool to insert or load data, index creation is more effective;
Tables and columns with correct Indexes
. Frequently retrieve rows in the sorting large table (40%) or non-sorting table (7%). We recommend that you create an index;
. To improve multi-Table Association, index columns are used for joining;
. The values in the column are relatively unique;
. Value range (large: B * tree index, small: bitmap index );
. Date columns are generally suitable for function-based indexing;
. Columns have many null values and are not suitable for indexing.
Arrange index columns for performance
. Multiple fields are often used together to retrieve records. The combined index is more effective than a single index;
. Put the most common columns at the beginning, for example, dx_groupid_serv_id (groupid, serv_id). In the where condition, use groupid, groupid, and serv_id. indexes are used for queries. If only the serv_id field is used, the index is invalid;
. Merge/split unnecessary indexes.
Limit the number of indexes for each table
. A table can have hundreds of indexes (will you do this ?), However, for frequently inserted and updated tables, the more systems the index has, the more CPU the I/O burden is;
. We recommend that each table have no more than five indexes.
Delete an index that is no longer needed
. The index is invalid because function-based or Bitmap indexes are used instead of B * tree indexes;
. Queries in applications do not use indexes;
. You must delete the index before recreating the index. If you use alter index... If you rebuild an index, you do not have to delete the index.
Index data block space usage
. Specify a tablespace when creating an index, especially when creating a primary key;
. Set pctfress properly. Note: you cannot specify pctused for the index;
. Estimate the index size and reasonably set storage parameters. The default value is the tablespace size, or the initial is set to the same size as next.
Consider creating indexes in parallel
. You can create indexes in parallel for large tables. When you create indexes in parallel, the storage parameters are used by each query server process. For example, the initial value is 1 M and the degree of parallelism is 8, at least 8 MB of space is consumed during index creation;
Consider using nologging to create an index
. You can use nologging to create indexes for large tables to reduce the number of redo logs;
. Saves space for redo log files;
. Shorten the index creation time;
. This improves the performance of parallel creation of large indexes.
How to create an optimal Index
Create an index explicitly
Create index index_name on table_name (field_name)
Tablespace tablespace_name
Pctfree 5
Initrans 2
Maxtrans 255
Storage
(
Minextents 1
Maxextents 16382
Pctincrease 0
);
Create a function-based index
. It is commonly used in functions such as UPPER, LOWER, and TO_CHAR (date). For example:
Create index idx_func on emp (UPPER (ename) tablespace tablespace_name;
Create a bitmap Index
. When you create an index for a column with a small base and a relatively stable base, you must first consider the bitmap index. For example:
Create bitmap index idx_bitm on class (classno) tablespace tablespace_name;
Explicitly create a unique index
. You can use the create unique index statement to create a unique index. For example:
Create unique index dept_unique_idx on dept (dept_no) tablespace idx_1;
Create a constraint-Related Index
. You can use the using index statement to create indexes related to the unique and primary key constraints. For example:
Alter table table_name
Add constraint PK_primary_keyname primary key (field_name)
Using index tablespace tablespace_name;
How to create a local partition Index
. The basic table must be a partition table;
. The number of partitions is the same as that of the base table;
. The number of subpartitions in each index partition is the same as that in the corresponding basic table partition;
. The row index in the subpartition of the base table is stored in the corresponding subpartition of the index. For example:
Create Index TG_CDR04_SERV_ID_IDX On TG_CDR04 (SERV_ID)
Pctfree 5
Tablespace TBS_AK01_IDX
Storage (
MaxExtents 32768
PctIncrease 0
FreeLists 1
FreeList Groups 1
)
Local
/
How to create a global index for a range Partition
. Basic Tables can be Global tables and partition tables.
Create index idx_start_date on tg_cdr01 (start_date)
Global partition by range (start_date)
(Partition p01_idx vlaues less than ('20140901 ')
Partition p01_idx vlaues less than ('20140901 ')
...
Partition p01_idx vlaues less than ('20140901 '))
/
Rebuilding existing Indexes
Rebuilding the current time of an existing index does not affect queries;
Re-indexing can delete additional data blocks;
Improves index query efficiency;
Alter index idx_name rebuild nologging;
For partition indexes:
Alter index idx_name rebuild partition partiton_name nologging;
Reason for deleting the index
. Indexes that are no longer needed;
. The index does not provide the expected performance improvement for the queries published on the relevant tables;
. The application does not use this index to query data;
. The index is invalid and must be deleted before reconstruction;
. The index has become too broken and must be deleted before reconstruction;
. Statement: drop index idx_name; drop index idx_name drop partition partition_name;
Index creation cost
During basic table maintenance, the system must maintain indexes at the same time. unreasonable indexes will seriously affect system resources, mainly on CPU and I/O;
Inserting, updating, and deleting data results in a large number of db file sequential read lock waits;
SQL optimizer Introduction
Rule-based Optimizer
. Always use index
. Always start from the driver table (the rightmost table of the from clause)
. Full table scan is used only when it is inevitable
. Any index can
Cost-based Optimizer
. Statistical data for tables and Indexes
Analyze table customer compute statistics;
Analyze table customer estimate statistics sample 5000 rows;
. Set the degree of parallelism and table partitions in a table
Optimizer Mode
Rule Mode
. Generally, the rules are ignored Based on the CBO and statistical information.
Choose Mode
. Oracle selects rule or first_rows or all_rows as needed
First_rows Mode
. Returning records at the fastest speed based on the cost will lead to a decrease in the overall query speed or consume more resources, tends to index scanning, suitable for OLTP Systems
All_rows Mode
. Cost-based, to ensure the minimum overall query time, tends to parallel full table Scanning
For example:
Select last_name from customer order by last_name; When first_rows is used, a record is returned quickly. However, if I/O is large, the returned record is slow when all_rows is used, but less resources are used.
Adjust SQL Table Access
Full table Scan
. Returned records: unordered tables> 40%, sorted tables> 7%. We recommend that you use parallel mechanisms to increase the access speed, DDS;
Index access
. The most common methods include unique index scan and index range scan, OLTP;
Fast and complete index Scanning
. Access all data blocks in the index. The result is equivalent to full table scan. You can use index scan instead of full table scan. For example:
Select serv_id, count (*) from tg_cdr01 group by serv_id;
. Single query parallelization
Select/* + full (emp) parallel (emp, 8) */* from emp;
Partitioned Tables have obvious effects.
Optimize SQL statement sorting
Sort operation:
. Order by clause
. Group by clause
. Select distinct clause
. When creating an index
. Union or minus
. Sort and merge connections
How to Avoid sorting
. Add Index
. Use the distinct clause in the Index
. Avoid sort merge connections
Use tips to adjust
Tips
. Syntax:/* + hint */
. Use table alias: select/* + index (e dept_idx) */* from emp e
. Inspection prompt
Common tips
. Rule
. All_rows
. First_rows
. Use_nl
. Use_hash
. Use_merge
. Index
. Index_asc
. No_index
. Index_desc (usually used with max built-in functions)
. Index_combine (mandatory bitmap index)
. Index_ffs (fast and complete index scanning)
. Use_concat (use union all for all or conditions in the query)
. Parallel
. Noparallel
. Full
. Ordered (cost-based)
Adjust table connection
Table connection type
. And other connections
Using equations in the where condition;
. External connections (left and right connections)
Put a (+) operator in the Equality predicate of the where Condition Clause. For example:
Select a. ename, B. comm from emp a, bonus B where a. ename = B. ename (+ );
This statement returns records of all emp tables;
. Self-connection
Select a. value total, B. value hard, (A. value-B. value) soft,
Round (B. value/a. value) *) perc
From v $ sysstat a, v $ sysstat B
Where a. statistic # = 179
And B. statistic # = 180;
Anti-connection
Anti-join is often used in not in or not exists. It refers to a subquery in which any records found in the query are not included in the result set. It is not recommended to use not in or not exists;
. Semi-join
Exists is used in the query. Meaning: Even if multiple duplicate records are returned in the subquery, only one record is returned in the external query.
Nested loop connection
. Used when an index exists in the connected table;
. Use use_nl.
Hash connection
. The Hash connection loads the driver table into the memory and uses the hash technology to connect to the second table to increase the connection speed.
. Suitable for connecting large tables and small tables;
. Use use_hash.
Sort and merge connections
. Sort merge connections do not use Indexes
. Usage principles:
No available indexes exist in the child segment of the connected table;
The query returns most of the data in the two tables;
CBO believes that full table scan is faster than index scan.
. Use use_merge
Use temporary/intermediate tables
When associating multiple large tables, you can store the result sets that meet the conditions in the intermediate table, and then associate them with the intermediate table;
SQL subquery Adjustment
Associated and non-associated subqueries
. Join: The subquery references an External table internally and runs each row once;
. Non-join: The subquery is executed only once and stored in memory.
Adjust not in and not exists statements
. You can use an external connection to optimize the not in clause, for example:
Select ename from emp where dept_no not in
(Select dept_no from dept where dept_name = 'Math ');
Changed:
Select ename from emp, dept
Where emp. dept_no = dept. dept_no
And dept. dept_name is null;
Use indexes to adjust SQL statements
Why Does Oracle not use indexes?
. Check whether the indexed column or the first column of the composite index appears in the WHERE clause of the PL/SQL statement. This is a necessary condition for the execution plan to use the relevant index.
. See which type of connection is used. ORACLE supports Sort Merge Join (SMJ), Hash Join (HJ), and Nested Loop Join (NL ). When two tables are connected and the target column of the internal table has an index, only the Nested Loop can effectively use the index. Even if an index is built on the relevant column, SMJ can only avoid data sorting because of the existence of the index. Due to HASH calculation, the existence of Indexes has almost no impact on the data query speed.
. Check whether related indexes are allowed in the connection sequence. Assume that the deptno column of the table emp has an index, and the deptno column of the table dept has no index. The WHERE statement has the condition emp. deptno = dept. deptno. During the NL connection, emp is first accessed as the External table. Due to the connection mechanism, the External table data is accessed in full table scan and emp. the index on deptno is obviously not used. A full index scan or quick full index scan can be performed on deptno.
. Whether the system data dictionary table or view is used. Because the system data dictionary tables have not been analyzed, the execution plan may be very poor ". However, do not analyze the data dictionary tables without authorization. Otherwise, a deadlock may occur or the system performance may degrade.
. Whether the index column is a function parameter. If so, indexes cannot be used during queries.
. Whether there is a potential data type conversion. For example, if you compare the numeric data with the numeric data, ORACLE automatically converts the numeric data using the to_number () function, resulting in the occurrence of the previous phenomenon.
. Whether to collect sufficient statistics for tables and related indexes. It is recommended that you analyze tables and indexes on a regular basis for tables with frequent data additions, deletions, and changes. You can use the SQL statement "analyze table xxxx compute statistics for all indexes ;". ORACLE can make the right choice only when it fully reflects the actual statistical data.
. The selection of index columns is not high. We assume that the emp table contains 1 million rows of data, but the emp. deptno column contains only four different values, such as 10, 20, 30, and 40. Although emp has many data rows, ORACLE determines that the values of the columns in the table are evenly distributed across all data rows by default. That is to say, each deptno value corresponds to 0.25 million data rows. Assume that the SQL search condition is DEPTNO = 10, and the efficiency of data search using the index on the deptno column is usually not higher than that of full table scan.
. Whether the index column value can be NULL ). If the index column value can be NULL, indexes, such as COUNT (*), are not used in SQL statements to return NULL values. Instead, full table scanning is used. This is because the stored values in the index cannot be empty.
. Check whether parallel query (PQO) is useful ). Indexes are not used for parallel queries.
. If no reason can be found from the above aspects, we have to use the hint method in the statement to force ORACLE to use the optimal "Execution Plan ". Hint adopts the annotation method, which can be line comment or segment comment. If you want to use the IND_COL1 INDEX of Table A, you can use the following method: "SELECT/* + INDEX (A IND_COL1) */* from a where COL1 = XXX ;"
How to shield Indexes
When there is a bad index in the statement execution plan, you can artificially block the index. Method:
. Numeric: Add 0 to the index field, for example
Select * from emp where emp_no + 0 = v_emp_no;
. Type: Add ''to the index field, for example
Select * from tg_cdr01 where msisdn | ''= v_msisdn;
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.