1. Each node that traverses the index tree is a page. The index tree has three types of nodes: root node, intermediate node, and leaf node. (1) The root node and the intermediate node only contain the entry value and entry pointer of the next node.
1. Each node that traverses the index tree is a page. The index tree has three types of nodes: root node, intermediate node, and leaf node. (1) The root node and the intermediate node only contain the entry value and entry pointer of the next node.
I. Traversal
Each node in the index tree is a page.
The index tree has three types of nodes: root node, intermediate node, and leaf node.
(1) like an intermediate node, the root node only contains the entry value and entry pointer of the next node. They are called index nodes;
(2) The leaf node contains the data to be traversed and the server space. For clustered indexes, data is the data rows in the table, and for non-clustered index data, the index column values and row bookmarks are used.
Index traversal always starts from the root node, that is, root traversal. There are two types: Index scanning and index search.
(1) index scanning refers to scanning leaf nodes one by one from the root node of the index tree until hitting all data that meet the search criteria;
(2) index search refers to jump from the root node of the index tree according to the search value in the index Node Based on Route information until the leaf node hits data.
The depth of the B + tree is usually less than or equal to 3. The calculation is as follows:
Taking clustered index as an example, a simple calculation is as follows: the sum of the width of the 10 INT columns is 40B. Assume that each layer of the clustered Index Tree is a binary tree with three layers in total, 2 ^ 0 + 2 ^ 1 + 2 ^ 2 = 1*(1-2 ^ 3)/(1-2) = 7 pages, 4 leaf nodes, each 8060000/40 K page can store 201500 = 806000 rows, multiplied by 4 = rows. If it is a three-way, four forks, the three layers can store tens of millions to hundreds of millions of rows of data, of course, when the data volume reaches this level, we usually select table partitions, so the depth of the B-tree will not break through the three layers.
Therefore, the efficiency of index search is very high. You should try to construct index search in the query to avoid index scanning.
Ii. insert
2.1. Ample page Space
When creating or recreating an index on an existing table, you can specify a fill factor, that is, you can reserve a certain amount of space on each node of the index tree for subsequent data use in the table. However, if an index is created and a fill factor is specified during table creation, the fill factor is useless and the database system does not deliberately reserve the page space.
The remaining space on the index page is shown as follows:
Reference: Insert a record with an index key value of 31 to the index tree as follows:
(1) perform the search operation with the index key value = 31 to confirm that the new record should be inserted into leaf node L2.
(2) check whether there is sufficient space on L2 to store the current record. Here we assume there is sufficient space;
(3) Move record 45 backward and insert a new record whose index key value is 31. After insertion, 10, 30, 31, and 45 are still sequential. Hong Kong servers, such:
2.2 insufficient page Space
In this case, insert a record with an index key value of 32. The procedure is as follows:
(1) perform the search operation with the index key value = 32 to confirm that the new record should be inserted into leaf node L2;
(2) check whether there is sufficient space on L2 to store the current record. If there is not enough page space, the Hong Kong server needs to split the page;
(3) apply for a new page L4 from the database system, move half of L2 data to L4, and re-link the Left and Right nodes of the leaves, such:
(4) At this time, the upper-layer node also needs to generate a pointer to a new leaf node. The upper-layer node is the root node. If the upper-layer node does not have any space available, it also needs to be split. There is space available here, for example:
(5) because the key value range of the current record is located in the last half of the split page, insert the new record with the index key value 32 into L4. If the key value range is located in the first half, then it is inserted into L2. If the L4 space is insufficient to store the new record whose key value is 32, L4 will continue to split the page. Here we assume that the space is sufficient and the insertion ends, for example:
Iii. Delete
3.1 delete records from leaf nodes
To delete a record whose index key value is 32, follow these steps:
(1) perform the search operation with the index key value = 32 to confirm that the record is in L4;
(2) mark the record with the index key value = 32 as a virtual shadow, but does not immediately release the space. The virtual shadow record can be used for transaction rollback and multi-version;
(3) If the virtual shadow record space on L4 is applied for use, the virtual shadow record will be erased;
(4) If the last record on the data page is deleted, the data page is recycled;
3.2 delete records from non-leaf nodes
(1) When the pointer in the index node is deleted, it is not a virtual shadow record, but space is also not released until a new pointer is inserted;
(2) After the data rows in the heap table are deleted, the page space will not be recycled. Even idle pages are still marked as allocated and cannot be used by other objects;
Note: theoretically, when the free space on the sibling node page is less than 50%, the sibling node should be merged, that is, the reverse split operation, however, this may result in more frequent page merging and splitting at a higher cost. Therefore, page merging is usually not performed in the database system unless rebuild/reorganize indexes.
Iv. Update
4.1 overwrite updates
If the update operation can replace the in-situ key value on the page, overwrite the update operation.
4.2 non-overwriting update
When an update cannot be overwritten, the update operation is divided into Delete and insert operations.
If the new record is too long during non-overwriting update, the data row will be moved during the page split process:
(1) Moving clustered Indexes has no impact on non-clustered indexes, because the key values of clustered indexes are stored in non-clustered indexes, and the split does not change the key values;
(2) the data page split in the heap table will leave a forward pointer in the original record to tell the non-clustered index where to find a new record;
Therefore, moving data rows will not bring about maintenance costs for non-clustered indexes. The maintenance costs of non-clustered indexes come from the changes in bookmarks:
(1) The key value of the clustered index is changed or deleted;
(2) The data rows in the heap table are deleted.
This article is from the "SQL Server DBA" blog. Be sure to keep this source