In the previous article, du Fei introduced the page types of SQL Server data files, the system manages the table's storage structure logically Based on the 96-byte header information and the system table.
In the previous article, du Fei introduced the page types of SQL Server data files, the system manages the table's storage structure logically Based on the 96-byte header information and the system table.
SQL Server table and index Storage Structure
Du Fei
In the previous article, we introduced the page type of SQL Server data files. The system manages the table storage structure logically by using 96-byte header information and system tables, specifically, SQL Server introduces the concepts of objects, partitions, heap, B-tree, and allocation units in the storage structure of tables.
Displays the Storage Organization of a table. Each table has a corresponding object ID and contains one or more partitions. Each partition has one heap or multiple B trees, the heap or B-tree structure is reserved. Each heap or B tree has three allocation units to store data, namely data, LOB, and row overflow. The most used allocation unit is data. If there are LOB data or records with a length of more than 8000 bytes, there may be another LOB allocation unit and row overflow allocation unit.
Summary: A table can have multiple partitions, but each partition (Heap/B tree) can have up to three allocation units, each of which can have many pages, there are three ways to organize data pages in each allocation unit based on whether the table has an index and whether the index is clustered or not:
1. Heap
Heap is a table without clustered indexes. The sys. partitions of the heap has a row. For each partition used by the heap, index_id = 0 is provided. There is only one partition. In the system table, each Allocation unit under this partition has a connection pointing to the Index Allocation Map page (IAM), which describes the partition information on the IAM page.
The column first_iam_page in sys. system_internals_allocation_units System View points to the first page of a series of IAM pages that manage heap allocation space in a specific partition. SQL Server uses the IAM page to move in the heap. The data pages and rows in the heap are not in any specific sequence and are not linked together. The only logical connection between data pages is the information recorded on the IAM page.
2. Tables with non-clustered Indexes
If a table has only non-clustered indexes but no clustered indexes, the corresponding index number is 2-2-50. For each non-clustered index, there is a corresponding partition. In the system table, there is a connection pointing to the root page for each allocation unit under this partition. Data Pages are connected by frontend and backend pointers, which is a complete tree structure. At the bottom layer of the tree, a connection points to the real data. The connection form is file number + page number + row number, and the real data is stored in heap form. As shown in:
3. Tables with clustered Indexes
The clustered index in the table. The corresponding index number is 1. It has a corresponding partition, and each allocation unit under this partition has a connection pointing to the root page. For clustered indexes, the leaf node stores real data instead of the non-clustered index. As shown in:
Non-clustered indexes and clustered indexes have the same B-tree structure, and the significant difference between them lies in the following two points:
Data rows in the base table are not sorted and stored in the order of non-clustered keys.
The leaf layer of a non-clustered index is composed of index pages rather than data pages.
Case study: Let's look at the storage structure of a table. The table we use here is a production table with more than 0.1 billion records. view the object_ID of the table, as shown in:
To view the partition information of this table, run the following command:
As you can see, this table has a total of 16 partitions, corresponding to different indexes. Basically, each partition has over records. From this figure, we can also see that the ID of the heap or B-tree is the same as the partition ID. If you want to further view the details of an index, you can use the following command, for example, view the 72057594067419136 information.
From this figure, we can see that this partition has only one allocation unit. IN_ROW_DATA indicates that this allocation unit is only used to store specific data. There are 5353 pages in total, 5346 pages are used, and 5320 pages of data are occupied.
To view the location of the root page, run the following command:
Note that the root page shown here is 0xec000001100. Due to the storage relationship, it is parsed in reverse order, that is, 0x0011000001EC, the first two bytes indicate the number of the file group, and the last four bytes indicate the number of the page, that is, (1, 0x01CE), which is changed to decimal (1,492 ), you can then use the dbcc page command described in the previous section to view the PAGE information, as shown in:
You can see the specific data. The returned results on this interface will vary depending on the clustered index and non-clustered index on the table. To view the total number of pages and partitions used by a table, run dbcc showconfig, as shown in:
In the case of the same table structure, creating a clustered index does not increase the table size, but creating a non-clustered index increases a lot of space. In terms of performance, the SQL Server product group has been tested, during select, update, and delete operations, clustered indexes have high performance. During record insertion, clustered indexes and non-clustered indexes have the same performance. No clustered indexes affect the insertion speed, but in the production environment, you must proceed with caution.
This article is from the "du Fei" blog. Please keep this source