In the previous article, we introduced the page type of the SQL Server data file, which managed the storage structure of the table from a logical level with 96-byte header information and system tables, specifically to the table's storage structure, where SQL Server introduced concepts such as objects, partitions, heaps, or B-trees, allocation units, and so on.
Shows the table's storage organization, each table has a corresponding object ID, and contains one or more partitions, each partition will have a heap or multiple B-trees, heap or B-tree structure is reserved. Each heap or B-tree has three allocation units to hold data, namely data, LOB, row overflow, and the most used allocation unit is data. If you have LOB data or records that are longer than 8000 bytes in length, there may be additional LOB allocation units and row overflow allocation units.
Small Summary: A table can have multiple partitions, but each partition (heap/b tree) has a maximum of three allocation units, each allocation unit can have many pages, for each allocation unit within the data page, according to whether the table has an index, and the index is clustered or nonclustered, the organization has the following three kinds:
1. Heap
A heap is a table that does not contain a clustered index. There is a row in the heap's sys.partitions, and index_id = 0 For each partition used by the heap. There is only one partition, in the system table, for each allocation unit below this partition there is a connection to the index Allocation map page (IAM), which describes the area information in the IAM page.
sys.system_internals_allocation_units The columns in the system view first_iam_page The first page of a series of IAM pages that manage the allocated space for the heap in a specific partition. SQL Server uses an IAM page to move through the heap. The data pages and rows within the heap are not in any particular order, nor are they linked together. The only logical connection between data pages is the information that is recorded within the IAM page.
2. Tables with nonclustered indexes
If there is a table with only a nonclustered index and no clustered index, the corresponding index number is 2--250. So for each nonclustered index, there is a corresponding partition, in the system table and in turn, for each allocation unit below the partition, there is a connection to the root page. Data pages are connected to each other by the front and back pointers, which is a complete tree structure. At the bottom of the tree, there will be a connection to the real data, the form of the connection is the file number + page number + line number, and the real data is stored in the form of a heap. As shown in the following:
3. A table with a clustered index
The clustered index in the table, with the corresponding index number 1. It has a corresponding partition, and each allocation unit under that partition has a connection to the root page. In the case of a clustered index, the leaf node holds the actual data, rather than the connection of the nonclustered index. As shown in the following:
Nonclustered indexes have the same B-tree structure as clustered indexes, and the significant difference between them is the following two points:
Case study: Let's look at a table's storage structure, the table we use here is a production table, there are more than 100 million records, see the table's object_id, as shown in:
In this table, I have made a partition to view its partition information and can use the command shown:
As you can see, this table has a total of 16 partitions, corresponding to different indexes, and basically each partition has 1千多万条 records. You can also see from this diagram that the ID of the heap or B-tree is the same as the partition ID, and if you want to see more specific information about an index, you can use the following command, such as viewing 72057594067419136 of the information.
From this diagram, we can see that this partition has only one allocation unit, In_row_data indicates that this allocation unit is only used to store specific data, a total of 5353 pages, 5346 pages have been used, the data occupies 5320 pages.
If you want to see the location of the root page, you can use the following command:
However, it should be noted that the location of the root page shown here is 0xec0100001100, because of the stored relationship, it is parsed in reverse, that is, 0X0011000001EC, the first two bytes indicate the filegroup number, and the next 4 bytes is the number of the page, i.e. ( 1,0X01CE), Change to Decimal (1,492), and then use the DBCC PAGE command that we described in the previous section to view the page information, as shown in:
You can see the specific data, and the results of this interface will vary depending on the clustered index on the table, the nonclustered index. If you view the total number of pages and extents used by a table, you can also use the command: DBCC Showconfig, as shown in:
In the same table structure, building a clustered index does not increase the size of the table, but creating a nonclustered index adds a lot of space, and in terms of performance, the SQL Server product group has been tested for high clustered index performance under Select, Update, and delete operations, and when inserting records, Clustered indexes and nonclustered indexes have the same performance, and no clustered index affects the insertion speed, but in a production environment it is prudent to do so.
SQL Server tables and index storage structures