SQL Server big Talk Storage structure (2)

Last Update:2017-06-06 Source: Internet

Author: User

Tags create index

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Reading directory (content)

1 Row Records how to store

1.1 Heap Table
1.2 Clustered Index Table

2 Nonclustered index structure
3 Nonclustered index key value contents

3.1 Nonclustered indexes on a heap table
3.2 Nonclustered indexes for clustered index tables (unique)
3.3 Nonclustered indexes for clustered index tables (not unique)

4 Nonclustered indexes how to find pages

If reproduced, please indicate the source of the blog: www.cnblogs.com/xinysu/, copyright belongs to the blog Garden su Home radish all. Hope you support! Previous post in this series link: SQL Server big Talk storage structure (1) _ Data page type and page instruction analysis back to the top (go to top) 1 Row Records how to storeTwo concepts are introduced here: the heap and the clustered index table. This section refers to MSDN. 1.1 Heap Table heap table, table with no clustered index, can create one or more nonclustered indexes. It is not stored according to a rule, in general, in the order in which rows are recorded in the table, but due to performance requirements, inbound data may be moved in different regions. Like a heap of sand, there is no clear order of organization. There is a row in the heap's sys.partitions, and index_id = 0 for each partition used by the heap. By default, a heap has one partition. When a heap has multiple partitions, each partition has a heap structure that contains the data for that particular partition. For example, if a heap has four partitions, there are four heap structures, and each partition has a heap structure. Depending on the data type in the heap, each heap structure will have one or more allocation units to store and manage the data for a particular partition. Each partition in each heap has at least one In_row_data allocation unit. If the heap contains large object (LOB) columns, then each partition of the heap will also have a Lob_data allocation unit. If the heap contains a variable length column that exceeds the row size limit of 8,060 bytes, there is also a Row_overflow_data allocation unit in each of its partitions. Sys.system_internals_allocation_units column first_iam_page in the system view points to the first IAM page in the IAM page chain that manages the space allocated to the heap in a particular partition. SQL Server uses IAM pages to move between heaps. The data pages and rows within the heap are not in any particular order, nor are they linked together. The only logical connection between data pages is the information that is recorded within the IAM page.

A table that has a clustered index, called a clustered index table, is stored by the collation of its clustered index, but note here that in one page, not the row records are sorted by their clustered index collation, but the row offsets are stored according to their collation. 1.2 Clustered Index Table in SQL Server, the index is organized by the B-tree structure. Each page in the index B-tree is called an index node. The top node of the B-tree is called the root node. The underlying node in the index is called a leaf node. Any index level between the root node and the leaf node is collectively referred to as the intermediate level. In a clustered index, the leaf node contains the data page of the underlying table. The root node and the intermediate level node contain index pages that have index rows. Each index row contains a key value and a pointer to a data row in an intermediate-level page or leaf-level index on the B-tree. The pages in each level index are linked in a two-way link list. Clustered index has a row in sys.partitions, where the index uses index_id = 1 for each partition. By default, a clustered index has a single partition. When a clustered index has more than one partition, each partition has a B-tree structure that contains data related to that particular partition. For example, if the clustered index has four partitions, there are four B-tree structures, each with a B-tree structure. Depending on the data type in the clustered index, each clustered index structure will have one or more allocation units where the relevant data for the specific partition will be stored and managed. There is at least one In_row_data allocation unit in each partition of each clustered index. If a clustered index contains a large object (LOB) column, there is also a Lob_data allocation unit in each of its partitions. If a clustered index contains a variable length column that exceeds the row size limit of 8,060 bytes, there is also a Row_overflow_data allocation unit in each of its partitions. The pages and rows in the data chain are sorted by the clustered index key value. All insert operations are performed when the key value in the inserted row matches the sort order in the existing row. Explicitly the structure in a single partition of a clustered index. Thus, it can be seen that the heap table does not have a specific storage order, typically stored in the order of inserts, but sometimes because of performance requirements, the data will be stored around While the data rows of the clustered index table are stored in the order of the clustered keys, the leaf nodes are row records. Top (go to top) 2 Nonclustered index structureA nonclustered index can be created, whether it is a heap table or a clustered index table. Nonclustered index pages are also b-tree structures, but there are several differences: nonclustered indexes do not affect the order in which the underlying tables are stored, and their leaf nodes are composed of index pages rather than data pages. When you need to find a row record through a nonclustered index, first locate the B-tree tree where the nonclustered index is located, find the corresponding leaf node, and then, based on the corresponding row locator on that key value, find the row record location that it points to. So, what about the row locator? This also needs to analyze the content of the key value of the nonclustered index before it can be clearly understood, as described in the following analysis cases. Top (go to top) 3 Nonclustered index key value contentsCreate 3 tables: Heap table, clustered index non-unique table and clustered index Unique table, and create nonclustered index, insert part of data at the same time. --Creating a heap table CREATE TABLE tb_heap (ID int, name varchar (+), age int)--Create a clustered index (non-unique) Table creation table Tb_clu_no_unique (ID int identity (+), name varchar (+), age int) Create CLUSTERED index ix_clu_id on Tb_clu_no_unique (ID)--Create a clustered index and key value Unique Table CREATE TABLE tb_p K (ID int primary key identity (), name varchar (+), age int)--Create nonclustered index creation index ix_tb_pk_name on TB_PK (name) Create IND Ex ix_tb_heap_name on TB_HEAP (name) create index ix_tb_clu_no_unique_name on Tb_clu_no_unique (name)--build data insert INTO Tb_ PK (name,age) Select Name,cast (rand () *100 as int) from master.dbo.spt_values where name isn't Nullinsert into Tb_clu_no_un Ique (name,age) Select Name,age from Tb_pkinsert to Tb_heap (id,name,age) Select Id,name,age from TB_PK 3.1 Nonclustered indexes on a heap table#会话窗口查看ind, the need to open 3604 trace dbcc TRACEON (3604) DBCC IND (' dbpage ', ' tb_heap ', 2) can draw these conclusions:

pageid=238 is an IAM page, judging by: iamfid=null;
The Tb_heap index on the ix_tb_heap_name has 2 layers on the B tree structure, which is based on the following: Indexlevel maximum value is 1;
In the B-tree tree, the root page is pageid=239, and the left node leaf of the leaf node is 235

According to Indexlevel, Nextpagepid and Prevpagepid, you can draw the Ix_tb_heap_name data structure as follows (the drawing tool is broken, with drawing gadgets, this image ugly out of the sky): Select Pageid= 235, to parse the structure on the nonclustered index page. DBCC TRACEON (3604) DBCC page (' Dbpage ', 1,235, 3) looks at ' messages ' and can see that this is the index page, which currently stores 260 row index key values, the page free space is 12 bytes, and the free space starts with the No. 7660 byte. Look at the ' results ' as follows: You can see the row record for each row on this page, and you can see that the key value of the nonclustered index has 2 parts: name and HEAD Rid,name because the column is a nonclustered index, so what should be stored, what is RID? RIDs can be queried from the DBCC page, or through pseudo-column queries:%%physloc%%. The select *,%%physloc%% as rid from Tb_heap RIDs is actually used to uniquely identify each row of data in the heap table, accounting for 8 bytes, identifying the row in the following format: {File ID}:{page ID}:{slot ID}, text Part number: Data page number: slot, which represents a single row of data from a stored angle. But from the results of DBCC, this is a 16 binary value, how to convert it? Conversion rules: divided into 8 bytes, the front 4bytes is the page id-> the middle 2bytes is the file id-> the last 2bytes is the slot id-> reverse order 10 binaryUse the RID to experiment with how to reverse parse. --1 is divided into 8 bytes E9 00 00 00 01 00 95 00--2 Front 4bytes for page idE9 00 00 00--3 Middle 2bytes for file id01 00--4 last 2bytes for slot id95 00--5 reverse order and Take 10 binary PageID, after the reverse order for the xx xx E9, the decimal is 16*14+9=233fileid, after the reverse order is 00 01, the decimal is 1slotid, 00 95 after the reverse order, the decimal 149 can be deduced, name= ' backup device ' , there is a row of rows of data stored in the first file in the No. 233 page of the 149-slot DBCC page (' Dbpage ', 1,233, 3)

As a result, you can launch: in a heap table, the key value of a nonclustered index contains two parts: an index column and a row record that Rid,rid uses to find the index key value. 3.2 Nonclustered indexes for clustered index tables (unique)#会话窗口查看ind, you need to open 3604 trace dbcc TRACEON (3604) DBCC IND (' dbpage ', ' tb_pk ', 2) according to 2.1 inference, the same can be concluded:

pageid=121 is an IAM page, judging by: iamfid=null;
The TB_PK index on the ix_tb_pk_name has 2 layers on the B tree structure, which is based on the following: Indexlevel maximum value is 1;
In the B-tree tree, the root page is pageid=126, and the left node of the leaf node is 120.

According to Indexlevel, Nextpagepid and Prevpagepid, you can draw the data structure of the Ix_tb_pk_name as follows: Select pageid=120 to analyze the structure on the nonclustered index page. DBCC TRACEON (3604) DBCC page (' Dbpage ', 1,120, 3) looks at ' messages ' and can see that this is the index page, which currently stores 296 row index key values, the page free space is 86 bytes, and the free space starts with the No. 7514 byte. Viewing ' results ', it can be found that in a clustered index and a unique table, a nonclustered index has 2 parts: Key-value columns + primary-key columns。 This is relatively well understood, because the clustered index key value uniquely identifies the row records for each row in a table that has a clustered unique index, so on a nonclustered index, you only need to include the two parts. 3.3 Nonclustered indexes for clustered index tables (not unique)#会话窗口查看ind, you need to open 3604 trace dbcc TRACEON (3604) DBCC IND (' dbpage ', ' Tb_clu_no_unique ', 2) according to 2.1 inference, the same can be concluded:

pageid=172 is an IAM page, judging by: iamfid=null;
The TB_PK index on the Tb_clu_no_unique has 2 layers on the B tree structure, which is based on the following: Indexlevel maximum value is 1;
In the B-tree tree, the root page is pageid=174, and the left node leaf of the leaf node is 171

Select pageid=171 to analyze the structure on the nonclustered index page. DBCC TRACEON (3604) DBCC page (' Dbpage ', 1,171,3) View ' messages ' , you can see, this is the index page, which currently stores 298 rows of index key values, the page free space 4 bytes, The free space starts with the No. 7592 byte. View ' results ' , note the column followed by parentheses ' (key) ', which indicates a key-value pair component, where you find that there are no previously seen key-value columns in the Uniquifier column. So, uniquifier column, what is this column for? Here, to better understand the Uniquifier column, you need to create a new table, insert a small number of row records that repeat the clustered index key value. CREATE TABLE tb_clu_no_unique_2 (ID int , name varchar (+), age int.) Create clustered index ix_clu_i_2 on TB_CLU _no_unique_2 (ID) CREATE INDEX ix_tb_clu_no_unique_2_name on tb_clu_no_unique_2 (NAME) INSERT into tb_clu_no_unique_2 (ID , name,age) Select 1, ' A ', 3;insert into tb_clu_no_unique_2 (id,name,age) Select 1, ' B ', 3;insert into tb_clu_no_unique_2 (ID , name,age) Select 2, ' C ', 3;insert to Tb_clu_no_unique_2 (id,name,age) Select 2, ' D ', 3;insert into tb_clu_no_unique_2 (ID , Name,age) SELECT 2, ' E ', 3;dbcc TRACEON (3604) DBCC IND (' dbpage ', ' tb_clu_no_unique_2 ', 2) DBCC PAGE (' dbpage ', 1,306, 3) can be found, in a clustered index and non-unique table, the nonclustered index has 3 parts: Key-Value column + primary-key column +uniquifier column。 A clustered non-unique index is established, and the table is stored in a clustered index order, but a clustered index alone cannot uniquely identify row records for each row, so you need to add a uniquifier column to uniquely identify it. Summary:

Nonclustered index key value contents of heap table: indexed column +rid
Nonclustered index key value content for clustered and unique indexed tables: Index column + primary key column
Nonclustered index key value content for clustered and non-unique indexed tables: Index column + primary key column +uniquifier column

Top (go to top) 4 Nonclustered indexes How to find pagesDepending on the second part, it is clear that the components of each type of nonclustered index are available. In the heap table, a nonclustered index is directly physically located based on the RID column within its key value, PageID is found from Fileid, and SlotID is found to locate the row record, which is the so-called bookmark lookup, based on RID lookup. In the clustered and unique index table, the nonclustered index finds the b-tree of the clustered index based on the clustered index column inside its key value, and the key value of the clustered index is found according to the B-tree tree, and the leaf node under the key value is the row record. In a clustered non-unique index table, the nonclustered index finds the b-tree of the clustered index based on the clustered index column within its key value, finds the key value of the clustered index according to the B-tree tree, and here is a bit different, depending on the key value found, the leaf node under the key value may have more than one row of records, this time, You need uniquifier to identify the row records. Reference document: Https://msdn.microsoft.com/zh-cn/library/mt786796.aspx SQL Server Performance Tuning combat

SQL Server big Talk Storage structure (2)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More