Abstract:This article gives a comprehensive introduction to the structure and internal management of B-tree indexes. At the same time, I have discussed some widely-spread statements related to B-tree indexes, such as the impact of deleting records on indexes, and regular index reconstruction can solve many performance problems.
1. concepts related to B-tree indexes
Indexes, like tables, also belong to segments. It stores user data and occupies disk space like a table. Only
However, the data storage format in the index is very different from that in the table. When you understand the index, you can imagine a book in which the content of the book is equivalent to the data in the table, and the directory in front of the book is equivalent to the index of the table. At the same time, the disk space occupied by the index is usually much smaller than that of the table. Its main function is to speed up data search and ensure data uniqueness.
However, as an optional data structure, you can choose to create an index for a table or not. This is because once an index is created, it means that when Oracle performs DML (including insert, update, and delete) on the table, it must handle additional work (that is, maintenance of the index structure) and storage overhead. Therefore, when creating an index, you need to consider the query performance improvement caused by index creation, and whether it is worthwhile compared with the additional overhead.
Physically, indexes can be divided into partition and non-partition indexes, regular B-tree indexes, bitmap indexes, and reverse indexes. Among them, the B-tree index is the most common index, because our article mainly discusses the B-tree index, so as long as we talk about the index below, it refers to the B-tree index.
The B-tree index is a typical tree structure. Its main components are:
1) leaf node: contains entries that direct to the data rows in the table.
2) branch node: the entries in the table point to other branch nodes or leaf nodes in the index.
3) root node: a B-tree index has only one root node, which is actually the branch node at the top of the tree.
It can be used to describe the structure of B-tree indexes. B Indicates the branch node, and l indicates the leaf node.
For branch node blocks (including root node blocks), the index entries contained in the blocks are arranged in order (by default, they are listed in ascending order, you can also specify the order in descending order when creating an index ). Each index entry (or each record) has two fields. The first field indicates the minimum key value of the linked index block under the current branch node block; The second field is four bytes, indicating the address of the linked index block, this address points to the following index block. The number of records that can be stored in a branch node block is determined by the size of the data block and the length of the index key value. For example, you can see that the root node blocks contain three records, namely (0 B1), (500 B2), and (1000 B3). They point to three branch node blocks. 0, 500, and 1000 indicate the minimum values of the key values linked to the three branch node blocks. B1, B2, and B3 indicate the addresses of the three branch node blocks.
For leaf node blocks, the index entries in the same order as the branch nodes are arranged in ascending order, you can also specify the order in descending order when creating an index ). Each index entry (or each record) also has two fields. The first field indicates the key value of the index, which is a value for a single column index, and multiple values are combined for multiple column indexes. The second field indicates the rowid of the record row corresponding to the key value. This rowid is the physical address of the record row in the table. If the index is created on a non-partitioned table or the index is a local index on the partitioned table, the rowid occupies 6 bytes. If the index is a global index created on the partitioned table, the rowid occupies 10 bytes.
After knowing this information, we can give an example to illustrate how to estimate the number of entries that each index can contain, and how large the indexes are produced for the table. For each index block, the default pctfree value is 10%, that is, up to 90% of this value can be used. At the same time, after 9i, this 90% cannot be used up, and only about 87% of it can be used. That is to say, the size of 8 KB data blocks that can actually be used to store index data is about 6488 (8192 × 90% × 88%) bytes.
Suppose we have a non-Partition Table named warecountd, with the number of rows being 1.3 million. This table has a column named goodid and its type is Char (8). That is to say, the length of goodid is a fixed value: 8. At the same time, a B-tree index is created on the column.
In a leaf node, each index entry occupies one row of space in the data block. Each row uses two to three bytes as the row header, which stores information such as the tag and lock type. At the same time, in the first field that represents the index's key value, each index column has one byte representing the Data Length, followed by the specific value of the column. In this example, the data contained in a row on the leaf node is roughly as follows:
As you can see, in the leaf node of this example, an index occupies 18 bytes. At the same time, we know that 8 KB data blocks can be used to store 6488 bytes of index entries. In this example, a data block can be placed at about 360 (6488/18) index entries. For the 1.3 million records in our table, about 3611 (1300000/360) leaf node blocks are required.
For an entry (one row) in a branch node, because it only needs to save the address of the other index blocks linked to it, it does not need to save the specific data row, therefore, it consumes less space than leaf nodes. The space required for the minimum key value of The Link stored in a row of the branch node is the same as that of the leaf node described above, and the address of the index block is only 4 bytes, it is two bytes less than the rowid in the leaf node. The two bytes less are the space required by rowid to describe the row number in the data block. Therefore, in this example, the data contained in a row in the branch node is roughly shown in 3:
As you can see, in the branch node of this example, an index occupies 16 bytes. In the same way as the leaf nodes above, we can know that a branch index block can store about 405 (6488/16) index entries. For the 3611 leaf nodes we need, a total of about 9 branch index blocks are required.
In this way, we know that our index has two layers: the first layer is a root node, the second layer is 9 branch nodes, and the number of leaf nodes is 3611, the number of rows in the table to be pointed to is 1300000. However, note that in Oracle indexes, the level number is reversed. That is to say, if an index has N layers, the level Number of the root node is N, the level of the branch node next to the root node is N-1, and so on. In this example, the level of 9 branch nodes is 1, while that of the root node is 2.