In this article, I want to talk about how SQL Server stores indexes on variable-length columns. First we create a table with variable-length columns, define the primary key above, define the clustered index above, and insert 80,000 records inside:
1 --Create a new table2 CREATE TABLECustomers3 (4CustomerNameVARCHAR(255) not NULL PRIMARY KEY,5FillerCHAR(138) not NULL6 )7 GO8 9 --Insert 80.000 RecordsTen DECLARE @i INT = 1 One while(@i <= 80000) A BEGIN - INSERT intoCustomersVALUES - ( the 'CustomerName' + CAST(@i as VARCHAR), - 'Filler' + CAST(@i as VARCHAR) - ) - + SET @i += 1 - END + GO
as we can see from the code, I have established a PRIMARY KEY constraint on the varchar (255) column, and SQL Server enforces this column as a unique clustered index. Next we get the relevant physical information for the clustered index through the DMV sys.dm_db_index_physical_stats:
1 --Retrieve Physical information about the clustered index2 SELECT * fromsys.dm_db_index_physical_stats3 (4 db_id('allocationdb'),5 object_id('Customers'),6 NULL,7 NULL,8 'detailed'9 )Ten GO
As can be seen from the output, in the index page, the value of the Min_record_size_in_bytes column is the value of the 7,max_record_size_in_bytes column is 28. We can conclude that, within the index record, the clustered key is stored in a variable-length column. We set up a Help table to store the output information of the DBCC IND for further analysis.
1 --Create a helper table2 CREATE TABLEhelpertable3 (4PagefidTINYINT, 5PagepidINT, 6IamfidTINYINT, 7IampidINT, 8ObjectIDINT,9IndexIDTINYINT,TenPartitionNumberTINYINT, OnePartitionIDBIGINT, AIam_chain_typeVARCHAR( -), -PageTypeTINYINT, -IndexlevelTINYINT, theNextpagefidTINYINT, -NextpagepidINT, -PrevpagefidINT, -PrevpagepidINT, + PRIMARY KEY(Pagefid, Pagepid) - ) + GO A at --Write everything in a table for further analysis - INSERT intoHelpertableEXEC('DBCC IND (allocationdb, Customers, 1)') - GO - - --Retrieve The root index page (1 page) - SELECT * fromhelpertable in WHEREIndexlevel= 2 - GO
My root page here is 15058, we use the DBCC PAGE command to view the next root page (remember to execute DBCC TRACEON (3604) first).
1 DBCC TRACEON (3604)2GO3--Dump out the root index page4 DBCC1150581)5GO
The numbers shown below are as follows:
00000000: 00010001 001b0043 7573746f†&.; ........ Custo 00000010: 333533†††††††††††††mername1353
Let's analyze these 16 binary values:
95020000 0100 0100 1b00 43757374 6f6d6572 4e616d65 31333533
- 26 The first 1 bytes represent the status bit
- 95020000 This 4 bytes represents the child page ID pointed to by the index record (Child-page-id)
- 0100 These 2 bytes represent the sub-file ID pointed to by the index record (Child-file-id)
- 0100 These 2 bytes represent a variable-length column number
- 1b00 This 2 bytes represents the offset from the end of each variable-length column. Each variable-length column requires 2 bytes. This is consistent with storing the variable length column in the data page. Here we have 1 variable-length columns, so SQL Server requires 1 2 byte offsets of--27 byte. This means that the offset of the next byte until byte is the part of our variable-length column (the clustered key).
- 43757374 6f6d6572 4e616d65 31333533 The 16 binary value of the aggregation key, which is the CustomerName column.
From the above explanation, we can see that the SQL Server storage variable-length index column format is the same as the data page in which the variable-length columns are stored. But you have to know that there is a little extra overhead because you need an extra 2 bytes to store the variable-length columns, and 2 bytes for each variable-length column in the variable-length column-offset array. Be aware of these storage costs when designing indexes and calculating how many index records are placed on an index page.
How SQL Server stores indexes on a variable-length column