SQL Server Storage Engine--06. Traversal and maintenance of indexes

Source: Internet
Author: User

First, traverse

Each node of the index tree is a page.

There are three types of nodes in the index tree: the root node, the middle node, and the leaf node.

(1) The root node, like the middle node, contains only the entry values and the entry pointers of the next layer of nodes, which are called index nodes;

(2) The leaf node contains the data to traverse, the data is the data row in the table for the clustered index, and the nonclustered index data refers to the indexed column values and the row bookmarks.

The traversal of an index always starts at the root node, that is, the root traversal, divided into two types: Index scan and Index lookup.

(1) Index scanning, refers to the root node of the index tree, the leaf node scanning, until the hit all the data to meet the search criteria;

(2) Index lookup, refers to starting from the root node of the index tree, by the lookup value in the index node according to the routing information to jump, until the leaf node to hit the data.

The depth of the B + tree is usually less than or equal to 3, calculated as follows:

Taking a clustered index as an example, the simple calculation is as follows: 10 int column width sum is 40B, assuming that the clustered index tree each layer is two forks, a total of three layers, namely 2^0+2^1+2^2=1* (1-2^3)/(1-2) = 7 pages, 4 leaf nodes, each page 8060K can store 8060000/ 40=201500 line, multiplied by 4 = 806000 rows, if it is a three-fork, four forks, then three can store tens of millions of rows of data, of course, when the amount of data reached this level, usually we will choose the table partition, then the B-tree depth will not break through the three layer.

Therefore, the efficiency of index lookup is very high, in the query should try to construct index lookup, avoid index scan.

Second, insert

2.1. Sufficient page space

When you create or rebuild an index on a table that already has data, you can specify a fill factor, which is to reserve a certain amount of space on each node of the index tree for subsequent increases in data in the table. However, if the index is created when the table is created and the fill factor is specified, the fill factor is useless, and the database system does not deliberately preserve the space of the page.

The index page has the remaining space as follows:

Reference, insert a record with an index key value of 31 into the index tree, in the following steps:

(1) Perform a lookup operation on the index key value =31 to determine that the new record should be inserted into the leaf node L2.

(2) Check if there is enough space on the L2 to hold the current record, assuming there is enough space;

(3) Move the record 45 backward and insert a new record with an index key value of 31. After insertion, 10, 30, 31, 45 are sequential, such as:

2.2. Insufficient page space

Reference, insert a record with an index key value of 32 at this point, in the following steps:

(1) Perform the lookup operation of the index key value =32, and make sure that the new record should be inserted into the leaf node L2;

(2) Check whether there is enough space on the L2 to hold the current record, then found that there is not enough page space, at this time to split the page;

(3) Apply a new page L4 to the database system, move half of L2 data into L4, and relink the left and right nodes of the leaf, such as:

(4) At this point, the upper node also needs to generate a new leaf node pointer. The top node here is the root node, and if the upper node has no space left, it also needs to be split, where there is space, such as:

(5) Because the key value range of the current record is in the latter half of the page split, a new record with an index key value of 32 is inserted into L4 and inserted into L2 if the key value range is in the first half. If L4 does not have enough space to hold a new record with a key value of 32, L4 will continue to split the page, assuming enough space to insert the end, such as:

Third, delete

3.1. Delete the records in the leaf node

For reference 5, delete the record with index key value 32 as follows:

(1) Perform the lookup operation of index key value =32, determine the record in L4;

(2) Mark the record of the index key value =32 as a ghost, but not immediately free up space, ghost record can be used for transaction rollback, multi-version, etc.;

(3) If the shadow record space on L4 is applied, the ghost record will be erased;

(4) If the last record of the data page is also deleted, the data page will be recycled;

3.2. Delete records from non-leaf nodes

(1) When a pointer in an index node is deleted, it is not a ghost record, but it does not free up space until a new pointer is inserted, and space compression is performed;

(2) When the data row in the heap table is deleted, the page space will not be recycled, even if the free paging is still identified as assigned state and cannot be used by other objects;

Note: Theoretically, in the Brother node page free space is less than 50%, should be the sibling node merging, that is, split the inverse operation, but this may result in more frequent page merging, splitting, more expensive, so in the database system is usually not a page merge operation, unless rebuild/ Reorganize index.

Iv. Update

4.1. Overwrite Updates

Overwrite updates if the update operation is able to replace the in-place key value within the page.

4.2. Non-overwrite update

When an overwrite update cannot be made, the update operation is decomposed into a delete and insert operation.

If the new record is longer during a non-overwrite update, it will cause the movement of the data rows during the page splitting process:

(1) The move of the clustered index has no effect on the nonclustered index, because the key value of the clustered index is stored in the nonclustered index, and the split does not change the key value;

(2) The data page in the heap table splits, leaving a forward pointer at the original record to tell the nonclustered index where to find the new record;

So the movement of data rows does not cost maintenance to nonclustered indexes, and the maintenance cost of nonclustered indexes comes from changes in bookmarks:

(1) The key value of the clustered index is changed or deleted;

(2) Data rows in the heap table are deleted.

--------------------------------------------------------------------------------------
Translated from: http://qianzhang.blog.51cto.com/317608/1217346
--------------------------------------------------------------------------------------

SQL Server Storage Engine--06. Traversal and maintenance of indexes

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.