A preliminary understanding of Oracle B-tree indexes

Last Update:2018-02-10 Source: Internet

Author: User

Tags keyword list mysql index

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

A preliminary understanding of Oracle B-tree indexes

Today, we will study Oracle's B-tree indexes. Through this article, you will know which types of Oracle B-tree indexes are available and how Oracle B-tree indexes are implemented, oracle uses the B-tree index to retrieve data and limits B *-tree indexes, and the difference between the B-tree indexes of Oracle and mysql.

I. subtypes of B-tree indexes in Oracle:

B * tree indexes are the most common indexes in Oracle and most other databases. B * tree structures are similar to binary Trees. However, "B" here does not represent binary ), balanced means that the B * tree index has the following subtypes:

1) index organized table: The index organization table is stored in the B * tree structure. We know that the default Oracle table is a heap table, heap tables are stored in an unorganized manner (data can be stored as long as there is available space), and IOT is different from it, data in IOT is stored and sorted in the order of primary keys. For applications, IOT performance is no different from that in conventional heap tables. You need to use SQL to access IOT correctly, IOT is most useful for information retrieval, spatial systems, and OLAP applications. In a simple summary, an index is used to organize a table. indexes are data, and data is indexes, because data is stored in the B * tree structure.

2) B * tree clustering index (B * tree cluster index): based on clustering keys (such as age = 27), keys in the traditional btree index point to one row, different from B * tree clustering, a clustering key points to a block, which contains multiple rows related to the clustering key,

3) descending index: allows data to be sorted in the order of "from large to small" in the index structure, rather than "from small to large (ascending, when you query data and finally sort oder by A desc and B asc, you can create A descending index to avoid expensive sorting (sort order by) operations, create the statement as follows:

SQL> create index idex_name on table_name (A desc, B asc );

4) reverse key index: this is also a B-tree index, but the key bytes will be "Reversed". If the index is filled with an incremental value, index entries can be more evenly distributed in the index, mainly to solve the competition of "right" index leaf blocks, such as in an Oracle RAC environment, some columns are filled with a sequence value or timestamp. The index created for these columns belongs to the "right" index, that is, the relative concentration of data distribution. The biggest advantage of using reverse indexes is to reduce the competition for index leaf blocks, reduce index hotspot blocks, and improve system performance.

1. Reverse indexing applications

1) used to discover that index leaf blocks become hotspot Blocks

Generally, when data is used (usually in batch insert operations), it is concentrated in a continuous data range. Therefore, when normal indexes are used, the index leaf blocks may become overheated, in severe cases, the system performance may degrade.

2) used in the RAC Environment

When several nodes in the RAC environment access data in a centralized and intensive manner, the probability of index hotspot blocks occurring is high. If the system does not have high requirements for range search, you can use reverse indexing to improve the system performance. Therefore, this technology is more common in the RAC environment, which can significantly reduce the competition for index blocks.

2. disadvantages of using reverse Indexes

Due to the characteristics of the reverse index structure, if range scanning is often used in the system to read data (for example, the where clause uses the "between and" statement or the comparison operator ">" <), reverse indexes will not apply, because a large number of full table scans will occur, which will reduce the system performance.

II. Implementation Principles of B-tree indexes in Oracle:

The structure of a classic BTree index is as follows:

Each node occupies the disk space of one disk block. Each node has n keywords in ascending order and (n + 1) pointer to the Child root node (the medium keyword is 51,101,151 ....., then 0 to 50 correspond to a pointer, and 51 to 100 correspond to a pointer). This pointer stores the address of the disk block where the sub-node is located (note that n is used to create an index, based on the data size, if the data size is too large, the layer-3 may not be able to meet the requirements, and the layer-4 B + tree is required. Then, n keywords are divided into (n + 1) and then each range field corresponds to a pointer to point to the child node. The child node is further divided based on the keyword, and then the Pointer Points to the leaf node, and all leaf nodes are on the same layer of the tree. This means that all traversal from the index root node to the leaf node will access the same number of blocks, that is to say, the same number of I/O will be executed, in other words, the index is highly balanced,

Is 0... 50 corresponds to a pointer pointing to a subnode; 51... 100 corresponds to a pointer pointing to another subnode. Then, the subnode divides the region based on the keyword and points the pointer to the leaf node, it is worth noting that Oracle B * tree indexes store data on leaf nodes (or leaf blocks) and index key values (or index column values) and a rowid (pointing to a pointer or physical location of the indexed row), as shown in, there is a two-way linked list between leaf nodes to improve the efficiency of index range scanning, because the column values of the index values are ordered, after finding the starting value, you can directly locate the next value in the adjacent order, for example, where id between 10 and 20, oracle finds the first leaf block with a minimum key value greater than or equal to 10, and then traverses the linked list of the leaf node horizontally until the last value is greater than 20;

Note that the B * index does not have a non-unique limit, that is, you can create a B * index on a non-unique column, but in a non-unique index, oracle will append rowid to the key as an additional column (with one Length Byte) to make the key unique. For example, there is a create index index_name on table (x, y) index, conceptually, it is create unique index index_name on table (x, y, rowid ). in a unique index, Oracle does not add rowid to the index key based on the uniqueness You define. In a non-unique index, you will find that data is first sorted by the index key value, then sort by rowid in ascending order, while in a unique index, data is only sorted by the index key value;

Iii. Data Retrieval Using the B * Tree Index.

Principles of B + tree indexes (Modified from the network ):

Then, we simulate the specific process where id = 29 :.

First, find disk Block 1 based on the root node and read it into the memory. Disk I/O operation 1st times]

Compare the keyword 29 in the range (), find the pointer P2 of disk Block 1.

Locate disk block 3 based on the P2 pointer and read it into the memory. Disk I/O operation 2nd times]

When the comparison keyword 29 is in the range (), find the pointer P2 of disk Block 3.

Find disk block 8 Based on the P2 pointer and read it into the memory. Disk I/O operation 3rd Times]

In the disk block 8 keyword list, find the keyword 29.

After analyzing the above process, it is found that three disk I/O operations and three Memory search operations are required. Because the keywords in the memory are an ordered table structure, you can use the binary lookup method to improve efficiency. Three disk I/O operations affect the efficiency of the entire B-Tree search.

Iv. Restrictions on Oracleb * tree indexes

1) Use a function in the index column. For example, SUBSTR, DECODE, INSTR, and so on. You need to create a function index to solve the problem.

For example, the dept table contains col_1 and col_2. The upper function index for col_1 is as follows:

Create index index_name ON dept (upper (col_1 ));

Function indexes are cost-based optimization methods-CBO. (In Oracle8 and later versions, we recommend using CBO instead of RBO for Oracle strong columns ), therefore, tables can be used only after analyze or hints;

2) the newly created table does not have time to generate statistical information. Just analyze it. We know that the Oracle optimizer judges the execution plan based on statistical information. If the statistical information is inaccurate, therefore, Oracle may make an execution plan without any index.

3) The Oracle optimizer cbo is based on the cost analysis of cost. The accessed table is too small and the consumption of full table scan is less than that of indexes.

4) use <>, not in, and not exist. in most cases, the result set is considered to be large, generally, if the value is greater than 5%-15%, full table scan (FTS) is performed without indexing ).

5) like "% _" percentage before.

6) independently reference index columns not at the first place in the composite index. Like Oracle and mysql, The btree index is the leftmost matching principle. When you create A composite index (A, B, C) it is equivalent to creating three indexes (A), (A, B), and (A, B, C;

7) When a numeric field is set to a number, no quotation marks are added in the where condition. Here, Oracle uses a function for implicit conversion, so it can be attributed to the first class. Using a function causes index failure, it is worth noting that the implicit conversion of VARCHAR2-> NUMBER can take the index; the implicit conversion of NUMBER-> VARCHAR2 will cause the index to become invalid. (VARCHAR2-> NUMBER does not invalidate the index. You can convert it to where id = to_number ('123 '). NUMBER-> VARCHAR2 will invalidate the index. I guess it is converted to where to_number (name) = 123)

8) When the variable uses the times variable and the table field uses the date variable, or the opposite is true.

9) If the index is INVALID (INVALID), you can consider rebuilding the index and alter index index_name rebuild online ;.

10) the B-tree index is null and is not null;

V. Differences between B-tree indexes in Oracle and mysql

In fact, the structure and principle of the B-tree index of Oracle and mysql are very similar, but the Oracle leaf node stores the key value + rowid. The content stored by the mysql index leaf node varies with the storage engine, there are also primary key indexes and secondary indexes as follows:

Oracle leaf nodes store the key value + rowid

Content stored by the leaf node in the MyISAM engine:

Primary Key Index: only store row pointers;

Secondary index: only a row pointer;

Content stored by leaf node in InnoDB Engine

Primary Key Index: clustered indexes Store complete data (full row data) (similar to Oracle's index organization table)

Secondary index: stores index column values + primary key information

Summary:

Indexes can improve the efficiency of data retrieval, but the indexing must be carefully established. The necessity of each index should be carefully analyzed and a basis should be established. Because too many indexes and inadequate or incorrect indexes have no benefits for performance: Each index created on a table increases storage overhead, indexes also increase processing overhead for insert, delete, and update operations. In addition, too many composite indexes generally have no value when there is a single-field index. On the contrary, they also reduce the performance when data is added and deleted, especially for frequently updated tables, the negative impact is greater.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More