Original: SQL Server index and table architecture (clustered index)
Clustered index
Overview
The concept of index and table architecture has always been the discussion of a lot of topics, including the table of various storage forms is the focus of discussion, on the various sites also have a lot of writing about this good article, The purpose of my writing this article is also in order to all the knowledge points as far as possible to organize together with their own understanding of this aspect of a detailed article out, but also listed some of my own doubts about the place to discuss, in the expression of limited capacity, some places may not be able to express the very clear, but also hope that we forgive For the article in the wrong place also hope that you can put forward, the purpose of writing the article is to share resources; For this series will be written 5 articles, in the next few days to publish, respectively, "Clustered index Architecture", "Nonclustered Index Architecture", "Heap Architecture", "with the index containing columns", " Table organization and Index organization.
Body
In SQL Server, indexes are organized by the B-tree structure. Each page in the index B-tree is called an index node. The top node of the B-tree is called the root node. The underlying node in the index is called a leaf node. Any index level between the root node and the leaf node is collectively referred to as the intermediate level. In a clustered index, the leaf node contains the data page of the underlying table. The root node and the intermediate level node contain index pages that have index rows. Each index row contains a key value and a pointer to a data row in an intermediate-level page or leaf-level index on the B-tree. The pages in each level index are linked in a two-way link list.
- Clustered index structure in a single partition
in SQL Server, the smallest unit in which data is stored is the page, the size of the data page is a 8k,,8 page, a zone of 64K, each page can hold 8060 bytes of data , the leaf node of the clustered index stores the actual data rows, and each page of data rows is stored sequentially, Data rows are stored sequentially based on the clustered index key, so a single data table can only build one clustered index.
Non-leaf nodes (with nodes and intermediate levels) store index records, one index record contains: Key value (the key value is the field value of the clustered index column) + pointer (point to index page or data page)
Because the data is stored in the data page, the index is stored in the index page, so retrieving the data for a single indexed column is faster than retrieving the data record, because you do not need to read the data page, only the index page to retrieve the data.
- Clustered index Column Selection
Narrow columns (columns with short field lengths): Because index pages store index records, index records store index values and pointers, and in order for index columns to store more index records, we select narrow columns.
Columns that are not updated frequently: Because a pointer to an index record points to a data page, updating a clustered index also creates a change in the nonclustered index page that causes IO consumption if the data is frequently updated to cause the index page to update, and because the row pointer of the data page of the nonclustered index points to the data row of the clustered index .
Non-repeating columns: Because data records in a clustered index's data pages are stored in a clustered order, when duplicate records are inserted into a clustered column, paging occurs when the data page exceeds 8060K, and pagination inserts half the records from the original page into the new page, resulting in index fragmentation.
You can use the self-increment column as a clustered index column (here is just a suggestion that needs to be based on the actual business)
Summarize
may be my ability to express less than the reason, a lot of time for the content I want to express can not be described in the form of text, this is the difficulty of writing articles!
Note: pursuer.chen Blog:http://www.cnblogs.com/chenmh This site all the essays are original, welcome to reprint, but reprint must indicate the source of the article, and at the beginning of the article clearly to the link, otherwise reserves the right to hold responsibility. Welcome to the exchange of discussions |
SQL Server Index and table architecture (clustered index)