Index Overview
Indexes, like tables, also belong to segments. It stores user data and occupies disk space like a table. However, the data storage form in the index is very different from that in the table. When you understand the index, you can imagine a book in which the content of the book is equivalent to the data in the table, and the directory in front of the book is equivalent to the index of the table. At the same time, the disk space occupied by the index is usually much smaller than that of the table. Its main function is to speed up data search and ensure data uniqueness. However, as an optional data structure, you can choose to create an index for a table or not. This is because once an index is created, it means that when Oracle performs DML (including insert, update, and delete) on the table, it must handle additional work (that is, maintenance of the index structure) and storage overhead. Therefore, when creating an index, you need to consider the query performance improvement caused by index creation, and whether it is worthwhile compared with the additional overhead.
Physically, indexes can be divided into partition and non-partition indexes, regular B-tree indexes, bitmap indexes, and reverse indexes. Among them, the B-tree index is the most common index.ArticleThis article mainly discusses the B-tree index. Therefore, all indexes mentioned below refer to B-tree indexes.
Internal Structure of the B-Tree Index:
1) leaf node: the key value of the data row and the rowid of the key value corresponding to the data row.
2) branch node: The smallest key-value prefix (minimum key prefix), used to make Branch selection between two key values (this block, all the key-Value Pair (key and rowid pair) pointing to the Child block containing the searched key value are opposite to the left and right sibling nodes (sibling) link and sort by (Key, rowid)
3) root node: a B-tree index has only one root node, which is actually the branch node at the top of the tree. It can be used to describe the structure of B-tree indexes. B Indicates the branch node, and l indicates the leaf node.
Branch node blocks (including root node blocks): The index entries contained in the branches are arranged in order (by default, they are sorted in ascending order, you can also specify the order in descending order when creating an index ). An index entry (also called each record, for example, an entry (0 L1) in a branch node block) contains two fields: the first field indicates the minimum key value of the linked index block under the current branch node block; The second field is four bytes, indicating the address of the linked index block, this address points to the following index block. The number of records that can be stored in a branch node block is determined by the size of the data block and the length of the index key value. For example, you can see that the root node blocks contain three records, namely (0 B1), (500 B2), and (1000 B3). They point to three branch node blocks. 0, 500, and 1000 indicate the minimum values of the key values linked to the three branch node blocks. B1, B2, and B3 indicate the addresses of the three branch node blocks.
Leaf node block: The index entries in the same order as the branch nodes are arranged in ascending order by default. You can also specify the leaf node block in descending order when creating an index) an index entry (or each record) contains two fields: the first field indicates the index key value, which is a value for a single column index; for multi-column indexes, multiple values are combined. The second field indicates the rowid of the record row corresponding to the key value. This rowid is the physical address of the record row in the table. If the index is created on a non-partitioned table or the index is a local index on the partitioned table, the rowid occupies 6 bytes. If the index is a global index created on the partitioned table, the rowid occupies 10 bytes.
After knowing this information, we can give an example to illustrate how to estimate the number of entries that each index can contain, and how large the indexes are produced for the table. For each index block, the default pctfree value is 10%, that is, up to 90% of this value can be used. At the same time, after 9i, this 90% cannot be used up, and only about 87% of it can be used. That is to say, the size of 8 KB data blocks that can actually be used to store index data is about 6488 (8192 × 90% × 88%) bytes.
Suppose we have a non-Partition Table named warecountd, with the number of rows being 1.3 million. This table has a column named goodid and its type is Char (8). That is to say, the length of goodid is a fixed value: 8. At the same time, a B-tree index is created on the column.
In a leaf node, each index entry occupies one row of space in the data block. Each row uses two to three bytes as the row header, which stores information such as the tag and lock type. At the same time, in the first field that represents the index's key value, each index column has one byte representing the Data Length, followed by the specific value of the column. In this example, the data contained in a row on the leaf node is roughly as follows:
As you can see, in the leaf node of this example, an index occupies 18 bytes. At the same time, we know that 8 KB data blocks can be used to store 6488 bytes of index entries. In this example, a data block can be placed at about 360 (6488/18) index entries. For the 1.3 million records in our table, about 3611 (1300000/360) leaf node blocks are required.
For an entry (one row) in a branch node, because it only needs to save the addresses of other linked index blocks, it does not need to save the specific data row, therefore, it consumes less space than leaf nodes. The space required for the minimum key value of The Link stored in a row of the branch node is the same as that of the leaf node described above, and the address of the index block is only 4 bytes, it is two bytes less than the rowid in the leaf node. The two bytes less are the space required by rowid to describe the row number in the data block. Therefore, in this example, the data contained in a row in the branch node is roughly shown in 3:
As you can see, in the branch node of this example, an index occupies 16 bytes. In the same way as the leaf nodes above, we can know that a branch index block can store about 405 (6488/16) index entries. For the 3611 leaf nodes we need, a total of about 9 branch index blocks are required.
In this way, we know that our index has two layers: the first layer is a root node, the second layer is 9 branch nodes, and the number of leaf nodes is 3611, the number of rows in the table to be pointed to is 1300000. However, note that in Oracle indexes, the level number is reversed. That is to say, if an index has N layers, the level Number of the root node is N, the level of the branch node next to the root node is N-1, and so on. In this example, the level of 9 branch nodes is 1, while that of the root node is 2.
note: in Oracle, null is defined as infinitely large, and null is not equal to null, therefore, no entries corresponding to the null value exist in the index. If a is null scan is performed on a table without other restrictions, a full table scan is performed. If the scan is not null, a full index scan is performed.