Index overview
The index, like a table, also belongs to a segment (segment). It contains the user's data, which takes up disk space as well as the table. However, the data stored in the index is very different from the data storage form in the table. In understanding the index, you can imagine a book, where the content of the book is equivalent to the data in the table, and the directory in front of the book is the equivalent of the index. At the same time, the index takes up more disk space than the table, and its main purpose is to speed up the search for data and to keep the data unique.
However, as an optional data structure, you can choose to create an index for a table or not to create one. This is because once an index is created, it means that Oracle will have to deal with additional work (that is, maintenance of the index structure) and storage overhead when it makes DML (including INSERT, UPDATE, DELETE) on the table. So when you create an index, you need to consider the query performance improvements that result from creating an index, which is worth the extra overhead.
Physically, indexes can generally be divided into: partitioned and non-partitioned indexes, regular B-Tree indexes, bitmap (bitmap) indexes, rollover (reverse) indexes, and so on. Among them, the B-Tree index is the most common index, because our article is mainly on the B-tree index of the discussion, so as long as the index refers to the B-tree index.
B-Tree Index internal structure
The B-Tree index is a typical tree structure that contains components that are mainly:
1) leaf node: The key value of the data row, the ROWID of the key value corresponding to the data row.
2) the branch node (Branch node): The minimum key-value prefix (minimum key prefix), which is used to make a branching selection between two key values (in this block), pointing to the pointer () of all the child blocks (children block) that contain the lookup key value-rowid Pairs (key and ROWID pair) are linked to their sibling nodes (sibling) and sorted in (KEY,ROWID) Order
3) root node: A B-Tree index has only one root node, which is actually the branch node at the top of the tree.
You can use a to describe the structure of a B-tree index. where b represents a branch node, and L represents a leaf node.
For a branch node block (including the root node block), it contains index entries that are sorted in order (by default ascending, or in descending order when you create the index). Each index entry (which can also be called a record) has two fields. The first field represents the minimum key value contained in the index block that is currently linked under the branch node block, and the second field is four bytes, which represents the address of the linked index block, which points to the following index block. The number of record rows that can be accommodated in a branch node block is determined by the size of the data block and the length of the index key value. For example, from one can see, for the root node block, contains three records, respectively (0 B1), (B2), (B3), they point to three branch node block. 0, 500, and 1000 respectively represent the minimum value of the key value linked by the three branch node blocks. The B1, B2, and B3 represent the addresses of the three branch node blocks pointed to.
for a leaf node block, it contains index entries in the same order as the branch nodes (the default is ascending, or can be specified in descending order when the index is created). Each index entry (which can also be called per record) also has two fields. The first field represents the key value of the index, which is a value for a single-column index, and multiple values are combined for a multicolumn index. The second field represents the rowid of the record line corresponding to the key value, which is the physical address of the record line in the table. If the index is created on a non-partitioned table or if the index is a local index on the partitioned table, the rowID occupies 6 bytes, and the rowID occupies 10 bytes If the index is a global index created on the partitioned table.
Knowing this information, we can give an example of how to estimate how many entries each index can contain, and how large the index is for the table. For each index block, the default pctfree is 10%, which means you can use up to 90% of them. At the same time, 9i later, the 90% can not be exhausted, only use about 87% of them. That is, the amount of space in a 8KB data block that can actually be used to hold the index data is approximately 6488 (8192x90%x88%) bytes.
Suppose we have a non-partitioned table with a table named Warecountd with rows of 1.3 million rows. There is a column in the table with a column named Goodid and a type of char (8), which means that the goodid length is fixed: 8. A B-Tree index is also created on this column.
In a leaf node, each index entry takes up a row of space in the data block. Each line uses 2 to 3 bytes as a wardrobe, and the wardrobe is used to store information such as tags and lock types. Also, in the first field that represents the key value of the index, each index column has 1 bytes for the data length, followed by the column-specific value. So for this example, the row in the leaf node contains roughly the same data as two:
As you can see, in the leaf node of this example, an index entry occupies 18 bytes. At the same time we know that the 8KB block of data can really be used to hold index entries in the space of 6488 bytes, in this case, a block of data can be placed about 360 (6488/18) index entries. For the 1.3 million records in our table, we need about 3611 (1300000/360) leaf node blocks.
For an entry in a branch node (one line), it takes less space than a leaf node because it simply saves the address of the other indexed block that is linked, without having to save a specific row of data. The minimum key value that is stored in a row of a branch node is the same as the leaf node described above, whereas the address of the stored index block requires only 4 bytes, which is 2 bytes less than the ROWID stored in the leaf node, and the 2 bytes that are ROWID The space required to describe the line number in the data block. Therefore, the row in the branch node in this example contains roughly the same data as three of the following:
As you can see, in the branch node of this example, an index entry occupies 16 bytes. In the same way as the leaf nodes above, we can know that a branch index block can hold about 405 (6488/16) index entries. For the 3,611 leaf nodes we need, a total of about 9 Branch index blocks are required.
So, we know that our index has 2 layers, the first layer is 1 root nodes, the second layer is 9 branch nodes, and the number of leaf nodes is 3,611, the number of rows to the table is 1300000 rows. Note, however, that in Oracle's index, the hierarchy number is reversed, that is, if an index has n tiers, the root node has a hierarchy number of N, and the branch node at the bottom of the root node is N-1, and so on. For this example, the 9 branch nodes have a hierarchy number of 1, and the root node is located at the level number 2.
Note: null is defined as infinite in Oracle, and Null is not equal to NULL, so the index does not have entries corresponding to null values. If an IS null scan is performed on a table without additional constraints, it will be a full table scan, if it is a NOT NULL scan will be a full index scan
Here is just as a study to discuss how to estimate, learn one thing, of course we have to know it, but also know why, the actual environment can use explain plan for viewing the execution plan to create the index, the index size, the creation time of the pre-judgment, specifically see
http://czmmiao.iteye.com/blog/1471756
Access to B-Tree indexes
Oracle has two types of I/O operations when Oracle processes need to access data blocks in data files:
1) random access, each time reading a block of data (by waiting for the event "DB file sequential read" reflected).
2) sequential access, which reads multiple blocks of data each time (by waiting for the event "db file scattered read" to manifest).
The first approach is to access the data blocks in the index, while the second way of I/O operations is a full table scan. Here comes a question, for
What random access corresponds to the DB file sequential read wait event, while sequential access corresponds to the db file scattered read wait event? This seems to be the reverse, the random access should be scattered (scattered), and sequential access should be sequential (sequential). In fact, wait events are primarily named based on how physical I/O blocks are actually acquired, rather than in the logical way that they are in the I/O subsystem. This is explained in the way that you can get an indexed block of data.
In fact, although the B-tree index is a tree-like stereoscopic structure, its corresponding to the data file arrangement of course is a flat form, that is, as follows.
/roots/Branches/branches/leaves/.../leaves/branches/leaves/leaves/.../leaves/branches/leaves/leaves/.../leaves/branches/...
Therefore, when Oracle needs to access an index block, it is bound to jump on this structure.
When Oracle needs to obtain an index block, first starting from the root node, according to the key value to be found, so that the next level of the branch node, and then access the next level of the branch node, again according to the key value of the next level of access to the branch node, and so on, eventually access to the lowest leaf node. It can be seen that when the physical I/O block is obtained, it is one after the other, in sequence, serially. In the process of obtaining the final physical block, we cannot read more than one block at a time because we do not know which block to access next if we do not get the current block. Therefore, when the data block is accessed on the index, it corresponds to the DB file sequential read wait event, which is rooted in the fact that we jump from one index block to another in order to find the final block of the index.
Then for a full table scan, there is no case to access the previous block before accessing the next block. When the full table is scanned, Oracle knows to access all the data blocks, so the only problem is to access them as efficiently as possible. As a result, Oracle can be synchronized in several batches, while acquiring multiple blocks of data. These batches of data blocks may be physically dispersed in the table, so they correspond to the db file scattered read wait event.
Effects of DML on B-Tree indexes
INSERT
During each insert operation, the keyword must be inserted at the position of the correct leaf node. If the leaf node is full and cannot hold more keywords, the leaf node must be split. There are two ways to split:
1) If the new keyword value is the largest of all the keywords in all the old leaf node blocks, then all the keywords will be split according to the 99:1 ratio so that only new keywords are stored in the new leaf node block, and all other keywords (including all deleted keywords) remain in the old leaf node block.
2) If the new keyword value is not the largest, then all the keywords will be split according to the scale of 50:50, when each leaf node block (old and new) will each contain the original leaf node half of the keyword.
This split must be passed up to the parent node through a new portal pointing to the new leaf node. If the parent node is full, the parent node must also be split, and the split will need to be routed up to the parent node. At this point, if the parent node is also full, the process will continue. In this way, a split may eventually be transferred to the root node. If the root node is full, the root nodes will also be split. When the root node splits, it is when the height of the tree increases. The way the root node splits is compared with the way other nodes split, and the processing in the physical location is different. When the root node splits, the original root nodes are split into branch nodes or leaf nodes, saved in a new block, and the new root node information is saved to the original root junction block, which is done as a result of avoiding the relatively large overhead of modifying the data dictionary.
Note: Oracle now employs a balancing algorithm, which normally does not create an unbalanced tree even if the index keyword grows. When the index keyword grows, causing the tree level to grow in one Direction, Oracle automatically indexes the index rollover to maintain the balance of the index, which, of course, consumes resources
At each level of the index, the block head of the leftmost node of each layer has a pointer to the leftmost block on the lower level, which facilitates fast full scan's quick positioning of the leftmost leaf node.
Each split process takes a certain amount of overhead, especially for physical hard disk I/O actions. In addition, before splitting, Oracle must find an empty block to hold the split. You can use the following steps to find the action of an empty block:
1) in the free list of indexes (Free-list, also known as the Idle list), a free block is found, and multiple idle lists can be defined for an index by the Create/alter Index command. The index free list does not help Oracle find a block that can be used to hold new keywords that will be inserted. This is because the keyword value cannot be randomly stored in the first "free" leaf node block available in the index, and the value must be placed in a particular leaf node block after the appropriate sort. Only the free list of indexes is required during block splitting, and each free list contains a list of links to "empty" blocks. When more than one free list is defined for an index, a free block is first scanned from the list of spaces assigned to the process. If the required free block is not found, the action of the free block is scanned from the main idle list.
2) If no free blocks are found, Oracle will attempt to allocate another extension segment. If there is no more free space in the tablespace, Oracle will generate error ORA-01654.
3) If the required free block is found through the above steps, the high water mark (HWM) of the index will be increased.
4) The free block found will be used to perform the split action.
One of the issues to be aware of when creating the b* tree index is to avoid splitting at run time, or to split ("pre-split") during the index creation process so that the split can be quickly hit to avoid runtime insert actions. Of course, these splits are not confined to the insert action, and it is possible that the split action will occur during the update process.
UPDATE
Index updates are completely different from table updates, where the data is changed inside the block (assuming there is enough space in the data block to allow such a change), but in an index update, if a keyword changes, its position in the tree needs to change. Keep in mind that a keyword has only one position in the b* tree. Therefore, when a keyword changes, the old table entry for the keyword must be deleted, and a new keyword will need to be created on a new leaf node. The old table entries may never be reused, because only in very special cases can Oracle reuse the key table slot, for example, the newly inserted keyword is exactly the one that was deleted (including data type, length, and so on). (a block is reused here, but when the same value is completely inserted, it is not necessarily inserted in the original deleted position, but inserted in the original block, possibly a new position in the block.) Because of this, the records that are saved in the index block may not be in the order of the keywords, and will change as the update is done. So, how likely is this to happen? Many applications use a sequence of numbers to produce the number keyword (especially the primary keyword). Unless they use the recycle option, the sequence will not produce exactly the same number two times. In this way, the deleted space in the index has not been used. This is why the size of the table is decreasing or at least remaining constant during a large-scale deletion and update process, but the index is increasing.
DELETE
When a record in the table is deleted, its corresponding index entry in the index is not physically deleted, only a delete tag is made. When a new index entry enters an index leaf node, Oracle checks to see if there is an index entry marked for deletion in the leaf node, and if so, all index entries with the delete tag are physically deleted from the leaf node.
When a new index entry enters the index, Oracle reclaims all currently emptied leaf nodes (all index entries in the leaf node are set to delete tags) and becomes the available index block again.
Although the amount of space occupied by the deleted index entries can be reused in most cases, there are still situations that can cause the index space to be wasted and result in a large number of index blocks but with few index entries, at which point the index is considered fragmented. The main causes of fragmentation of the index are:
1, unreasonable, high pctfree. Obviously, this will result in a reduction in the available space for the index block.
2, the index key value continues to increase (for example, using sequence to generate the key value of the serial number), while the index key values in sequential deletion, this can cause the index fragmentation. Because we knew earlier that a partial index entry was removed from an index block, the space can be retracted only if a key value enters the index block. The constant increase in the index key value will always be inserted in the previous index block, so that the space in the index can hardly be recovered, and only the index entries contained in the total deletion, the index block can be re-exploited.
3, often deleted or updated key values, which are almost no longer inserted, this situation is similar to the above situation.
Summary
Through the above analysis of the B-tree, we can draw the following application criteria:
1. Avoid indexing columns that are likely to produce very high update actions.
2. Avoid indexing more than one column in a table that is often deleted. If possible, index only the primary key and/or column that will be deleted on such a table. If it is unavoidable to index multiple columns, you should consider partitioning the tables based on these columns and then performing truncate actions (not delete actions) on each of these partitions. When used with the drop storage phrase, truncate simulates the process of deleting tables and indexes and recreating tables and indexes by re-setting the high water mark.
3. Avoid creating b* tree indexes for those columns that are not highly unique. Such low selectivity will result in the dense nature of the tree node blocks, resulting in large-scale index scans that occur as a result of index tiling (flat). The higher the degree of uniqueness, the better the performance because it reduces range scanning and may even replace range scanning with a unique scan.
4) Null values are not stored in a single-column index. For a composite index, storage of values is required only if a column is not empty. You should keep this in mind when you create an IS null or is not NULL phrase for a DML statement.
5) is null does not cause an index scan, and an is not null with no restrictions may cause a full index scan.
This article does not carry on the index internal structure The dump experiment, all is the theory research, the interested friend can search the related document on the net to carry on the research study
Reference to: http://btxigua.itpub.net/post/34419/406433
http://space.itpub.net/?uid-9842-action-viewspace-itemid-324139
http://space.itpub.net/?uid-9842-action-viewspace-itemid-312607
http://space.itpub.net/?uid-9842-action-viewspace-itemid-324586
This article original, reproduced please indicate the source, the author
If there is any mistake, please correct me.
Email: [Email protected]