SQL Server big Talk storage structure (1)

Source: Internet
Author: User
Tags create index mixed readable

Reading directory (content)

    • 1 Types of data pages
      • 1.1 PFS
        • 1.2.3 IAM
    • 2 Data page structure
      • 2.1 Page Header
      • 2.2 Rows of records
      • 2.3 Free Space
      • 2.4 Line Offset
    • 3 Ways to query data page storage formats
      • 3.1 DBCC IND
        • 3.1.1 Syntax description
        • 3.1.2 Test Cases
      • 3.2 DBCC page
        • 3.2.1 Syntax Description
        • 3.2.2 Test Cases
If reproduced, please indicate the source of the blog: www.cnblogs.com/xinysu/, copyright belongs to the blog Garden su Home radish all.    Hope you support! The data page size of SQL Server is 8kb,8 a contiguous physical page into a zone.    The mixed zone and the unified zone can be used to store the contents of different database objects, usually the database objects are small; the unified zone represents the contents of the same database object that are stored in contiguous 8 data pages. The disk read-write for SQL Server is page-level and the smallest unit of data per I/O operation is a page in 8kb.    The page is a storage unit, and the zone is the administrative unit. Data pages are divided into different types according to their storage content, which is briefly described in this paper. Top (go to top) 1 Types of data pagesDifferent types of data, stored in different types of pages, can be broadly divided into 3 directions: storage of the actual data, storage Management data page data and storage backup-related data, these 3 reverse, and each has a different page type. See the table below.
Id User_for Page Type Page Name Description
1 Actual data Data page Data Page Leaf nodes of a heap table or clustered index
2 Index page Index Page Branch node or nonclustered index for clustered index
3 Lob Lob For storing large object data types: text, image, varchar (max), varbinary (max), etc.
4 Row Overflow page Row Overflow Page Only single text or image column data blocks can be stored
5 Manage Data page data GAM page Global Allocation Map Managing bitmaps for the Unified zone
6 Sgam page Shared Global Allocation Map Managing bitmaps for mixed areas
7 Iam page Index Allocation Map The area to which the allocation unit is assigned
8 PFS Page Page Free Space Free space
9 Backup-related data DCM Page Differential Changed Map Information for the extents changed since the last backup DATABASE statement
10 BCM Page Bulk Changed Map Information for the extents modified by the bulk operation since the last BACKUP LOG statement
The page type for backup-related data is not described here in detail. The type of page that stores the actual data will be described in more detail later in the post. This article focuses on the type of page that stores data for managing data pages: Gam,sgam,iam,pfs.    This piece, the performance optimization of tempdb will be of great help. The beginning of each data file is distributed GAM,SGAM,PFS these pages, which record which pages in the database file are already in use and which pages are not used.

1.1 PFS PFS page, used to track page assignment levels and to store information about all page allocations and available space in the current data file, the 2nd data page for each data file is PFS, with page number 1. In this page, each a single byteDescribes whether each of the subsequent data pages can write records, that is, a PFS page is 8k, about 8k bytes can describe the subsequent use of each page, that is, a PFS page, can describe the use of 8k data pages, This means that a single PFS page can store approximately 64M of free space on a data page.    So, about every 64Mb, there will be a new PFS page. Each byte describes the use of a data page, a byte with 8 bits, respectively 第0-7位, for the following purposes:
    • Bit 0-2-bit, which describes how much free space the page has
      • 0x00 is empty
      • 0X01 is 1 to 50% full
      • 0X02 is Wuyi to 80% full
      • 0X03 is bayi to 95% full
      • 0X04 is ~ 100% full
    • Bit 3 (0x08): Is there a ghost record (Ghost records:http://www.cnblogs.com/lyhabc/archive/2013/06/16/3138214.html) in this data page?
    • Bit 4 (0x10): Is it an IAM page?
    • Bit 5 (0x20): Is it a mixed page?
    • Bit 6 (0x40): Is it allocated?
    • Bit 7 reserved, unused, with no actual meaning
1.2 GAM & SGAMThe area of SQL Server is divided into two types: mixed zone (uniform extent) and unified zone (mixed extent). Mixed zone, refers to the area of 8 consecutive data pages, respectively assigned to different database object storage use, which is conducive to the use of small tables in the process, not all of a sudden occupy a district, but first from the mixed area to use, reasonable allocation of space; a unified area, refers to the area of 8 consecutive pages are used to store the same data object, When a table is assigned 8 mixed area pages, it begins to allocate the use of a unified zone to avoid data being distributed across discrete pages of data. 1.2.1 GAM gam page, used to track the allocation of the area, to describe whether each area is assigned, each area with 1bit identification of its distribution。 A GAM page 8k, a total of 8k*8 bit, that is, almost can identify 64,000 zones, about 8k*8bit* (8*8k) =4g space, so, about every 4G space around, there is a GAM page to identify the next 4G of space allocation situation.
    • Bit=1, identifies that the current area is idle and can be used for allocation;
    • Bit=0, identifies the current zone and is used by the data.
SQL Server finds free space by reading the GAM page and assigns a page of this area or region to an object. A range that only stores one bit (instead of a PFS page, PFS is a page-by-byte) means that a single GAM page can track more space, and in one data file, you can find a new GAM page in approximately 4GB space. However, the page number of the first GAM page in the data file is always 2, so "2:1:2" represents the first GAM page in tempdb. 1.2.2 SGAM sgam pages, which are used to track the allocation of extents, describe which areas are mixed and have at least one free data page. 1bit describes a zone, where 1 indicates that the area is a mixed area and has at least one free data page, which can be assigned to objects that need to use a mixed area. SQL Server allocates space to small objects by reading the Sgam page to find a range mixed with the available space. A single Sgam page can track 4GB of space, so you can find them in 4GB intervals, just like the GAM page.    In a data file, the first SGAM page is Page 3, so "2:1:3" represents the first Sgam page of tempdb. So how do these two types of data pages assist the storage engine to manage the zone?
    • When the storage engine allocates a unified zone, look for a page labeled 1 in the GAM page, change the marker to the mark in the 0,sgam page, and keep it at 0;
    • When the storage engine allocates a mixed area, look for a page labeled 1 in the GAM page, change the marker to the mark bit in the 0,sgam page from 0 to 1;
    • When the storage engine looks for a mixed area with free pages, it looks directly at the data page for the tag bit 1 in the SGAM page; if it is not found, it is reassigned to a mixed area.
1.2.3 IAMThe allocation of the area, using GAM and sgam management; page usage, using the PFS page management. So, who can help manage the storage of each database object in each page or area?    The answer is an IAM page. In a table, an IAM page is used to describe the distribution of data, which is described based on the allocation unit. According to the actual data storage situation, divided into the following 3 allocation units:
    • In_row_data
      • The storage heap or index partition, which is heap and b-tree.
    • Lob_data
      • Stores large object (LOB) data types, such as XML, varbinary (max), and varchar (max).
    • Row_overflow_data
      • Stores variable-length data stored in a varchar, nvarchar, varbinary, or sql_variant column that exceeds the 8,060-byte row size limit.

Each table with data has at least one IAM page to manage the storage of the in_row_data, and if there is lob_data in the table, there will be one more IAM page to manage Lob_data,row_overflow_data.

Top (go to top) 2 Data page structureThe data page has 4 parts: header, row record, free space, and line offset. See: 2.1 Page Header

Takes 96 bytes to store the system data associated with the page.

The contents of the page header are as follows:

2.2 Rows of records
    • Store data row Records and index data
    • Row records can also be stored on separate pages, such as row overflow data or LOB data
2.3 Free Space
    • Drop header, row record, and offset remaining space, supplied to row record and row offset usage
2.4 Line Offset
    • The line offset is a small block of 2 bytes per small block, indicating that the data row starts recording after the first byte, that is, how many offsets from the page header start recording
    • Storage is stored from the left, with slots to describe, slot 0, slot 1 ....
    • What is the content of the row offset record? Which byte the row records from, in general, slot 1 begins after the 96th word
    • Often said clustered index storage order is physical sort, refers to not the row record physical ordering, but the row offset physical ordering, the data page, Row records are added in order, by modifying the row offset to achieve the order of the clustered index lookup
Top (go to top) 3 ways to query data page storage formatsSee how data pages are stored here are two ways to do this: DBCC PAGE view and DBCC IND view, these two directives are non-public directives, so the appropriate usage instructions are not found on MSDN. However, it is possible to access the analysis through the relevant information of the technical insider. These two instructions can view their parameter descriptions after you open trace flag 2588, and after you turn on trace flag 3604, you can put the result of the instruction execution back to the client instead of logging in the error log. 3.1 DBCC IND 3.1.1 Syntax DescriptionView the IND parameter description and open the trace flag 2588,help query.
DBCC TRACEON (2588) DBCC HELP (' IND ') DBCC execution is complete. If DBCC outputs an error message, contact your system administrator. DBCC IND ({' dbname ' | dbid}, {' ObjName ' | objid}, {nonclustered indid | 1 | 0 |-1 |-2} [, Partition_number]) DBC C is finished executing. If DBCC outputs an error message, contact your system administrator.
The format of the output has 4 ways, different ways, the output is not the same.
    • -2: Returns all IAM pages, based on the management of in-row data pages, row overflow data pages, and IAM pages for large object data pages
    • -1: Returns all IAM pages and data pages.
    • 0: Returns an IAM page that manages data pages within rows, in-row data pages
    • 1: Returns data page information for clustered index and IAM page information (same-1)
    • 2: Returns data page information and IAM page information for the 1th nonclustered index
    • 3: Returns data page information and IAM page information for the 2nd nonclustered index
    • ...
    • N: Returns data page information and IAM page information (n>1) for the n-1 nonclustered index

3.1.2 Test CasesNew table, Tbpage_c, contains large data objects and row overflow situations. Insert a row of data, and then analyze. In the database, each row of data is stored by default on the same data page, if a data page can not store a row of data, then there will be row overflow situation, specifically to understand the row overflow, so in this table, set two data columns occupy space >8k, see table SQL as follows, the SQL to build data is as follows.
CREATE TABLE Tbpage_c (ID int identity () NOT null primary key, Namea varchar (6000), Nameb varchar (+), descriptions text ) #name_a insert 6,000 characters, name_b insert 3,000 characters, descriptions insert 100 characters insert into Tbpage_c (namea,nameb,descriptions) Select      substring (Stuff ((select name+ ', ' from Master.dbo.spt_values for XML Path ("))," ("),"), 1,6000),      SUBSTRING (Stuff ((select name+ ', ' from Master.dbo.spt_values for XML Path (')), "("), "), 1,3000),      substring (stuff ( Select Name+ ', ' from Master.dbo.spt_values for XML Path (') '), 1, 1, '), 1,100)

DBCC IND (' dbpage ', ' Tbpage_c ',-2)

The option is-2, which displays all IAM pages for the table. Because the table has row overflow and large object column, so there will be its corresponding IAM page, so you can see there are 3 iam, in-row data, row-overflow data, LOB data.

The DBCC IND (' dbpage ', ' Tbpage_c ',-1) option is-1, which returns all IAM pages and data pages.
    • Data page number 310,309 belongs to the In-row data type. 309 record actual data, 310 record In-row data actual page distribution.
    • Data page number 307,308 belongs to the Row-overflow data type. 307 record actual data, 308 record Row-overflow data actual page distribution.
    • The data page number 305,306 belongs to the LOB data type. 305 record actual data, 306 record the actual data page distribution of LOB data.

The DBCC IND (' dbpage ', ' Tbpage_c ', 0) option is 0, which returns the IAM page that manages the data pages in the row, the inline data page, and therefore only returns the In-row data page.

The DBCC IND (' dbpage ', ' Tbpage_c ', 1) option is 1, and returning a clustered index involves all IAM pages and data pages. 3.2 DBCC PAGE 3.2.1 Syntax DescriptionView the page's parameter description and open the trace flag 2588,help query.
DBCC TRACEON (2588) DBCC help (' PAGE ') DBCC executes. If DBCC outputs an error message, contact your system administrator. DBCC PAGE ({' dbname ' | dbid}, FileNum, Pagenum [, printopt={0|1|2|3}]) DBCC execution is complete. If DBCC outputs an error message, contact your system administrator
The format of the output has 4 ways, different ways, the output is not the same.
    • 0: Output data page header data in readable form
    • 1: Output The data page header data in readable form, and also have slots corresponding to the hexadecimal contents of the record
    • 2: Output The data page header data in readable form, output the hexadecimal data of the entire page header, and display the contents of the whole page, including unused space.
    • 3: Output A readable form of data page header data, and include the record in the readable form of each field, row overflow data will also display the data content, but the big object does not display the content, but the storage location! So option 3 is the most comprehensive of the output content.
3.2.2 Test CasesUsing the 3.1.2 table, analyze its In_row data page, page_id=309. This test does not use with TABLERESULTS analysis if DBCC page (' Dbpage ', 1,309,0) With tableresults, the results are returned in tabular form and can be used for post-management analysis. The option is 0 to output data page header data in a readable format.
DBCC TRACEON (3604)
DBCC page (' Dbpage ', 1,309,0)------------------------------------------------------------------------------------ ----------------------PAGE: (1:309) buffer:buf @0x000000027c0827c0 bpage = 0x000000026fa86000 Bhash = 0x0000000 000000000 Bpageno = (1:309) Bdbid = Ten Breferences = 0 bcputicks = 0b Samplecount = 0 BUse1 = 46781 bstat = 0xbblog = 0x212121cc Bnex t = 0x0000000000000000 PAGE header:page @0x000000026fa86000 M_pageid = (1:309) m_headerversion = 1 M_type = 1/* M_pageid current page number; m_headerversion version number, always 1;m_type page data type, 1 for data page, 10 for IAM page, etc., refer to Pagetype*/m_typeflagbi ts = 0x0 M_level = 0 m_flagbits = 0xc000/*m_typeflagbits data page and index page is 4, other page is 0m_level the page is in Progression in index page (b-Tree), 0 for leaf node m_flagbits page flag */m_objid (allocunitid.idobj) = M_indexid (Allocunitid.idind) = 256/*m_indexid (A Llocunitid.idind) index id,0 represents the heap, 1 generationsTable clustered index, 2-250 means that a nonclustered index greater than 250 is the text or image field */metadata:allocunitid = 72057594040221696metadata:partitionid = 72057594038976512 Metadata:indexid = 1metadata:objectid = 341576255 M_prevpage = (0: 0) M_nextpage = (0:0)/*metadata:allocunitid storage unit Id,sys.allocation_units.allocation_unit_idmetadata:par Titionid the partition number where the data page resides, sys.partitions.partition_idMetadata:ObjectId the id,sys.objects.object_idmetadata of the object to which the page belongs: IndexID sys.objects.object_id&sys.indexes.index_idm_prevpage The previous page of the data page m_nextpage the next page of the data page */pminlen = 8 m_slotcnt = 1 m_freecnt = 5035m_freedata = 3155 m_reservedcnt = 0 M_LSN = (39:400:68) m_xactreserved = 0 M_xdesid = (0:0) m_ghostreccnt = 0m_ tornbits = 0 DB Frag ID = 1/*pminlen The number of bytes for the data in bytes m_slotcnt The number of rows of data in the page m_freecnt the amount of space remaining in the page, and how many bytes remain M_freedata the starting position of the page free space, a page of 8KB approximately equal to 8192 bytes of page free space in the position of 3155m_reservedcnt the number of bytes freed by the active transaction M_LSN the log record number m_xactreserved The latest number of bytes added to the m_reservedcnt realm M_xdesid the most recent transaction m_reservedcnt add to Idm_ GHOSTRECCNT the number of rows in the Phantom data m_tornbits the check digit of the page or is determined by the form of the Database page protection Lsnsql server that replaces the database page maintains a hash table in memory, Record the most recent LSN (Log Sequence number) value for all the pages that have written actions. The next time you read the page, you will compare whether the two values are equal. Because the LSN is a unique value that grows automatically, the LSN value will be larger than the original one for each newly modified page. So if the read LSN is inconsistent with the memory, it means that the last write request was not actually completed. The 824 error will also be triggered. */Allocation Status GAM (1:2) = Allocated SGAM (1:3) = ALLOCATEDPFS (1:1) = 0x60 Mixed_ext Allocated 0_pc T_full DIFF (1:6) = CHANGEDML (1:7) = Not min_logged

The option is 1, which outputs the data page header data in a readable form, and also has the hexadecimal contents of the slot corresponding to the record. Page header data has been analyzed above, not done, here describes the slot corresponding record
DBCC page (' Dbpage ', 1,309,1)------------------------------------------------------------------------------------  ----------------------page header information omitted ... Slot 0, Offset 0x60, length 3059, Dumpstyle byte/*slot slot number, one slot row of data, this row of data starting from 0x60 = 96, Length is 3059 bytes The 16 binary contents of the row record */recor D Type = Primary_record RECORD Attributes = null_bitmap Variable_columnsrecord Size = 3059 Memory Dump @0x000000001f978060/* Below records 3,059 bytes of content for this line */0000000000000000:30000800 01000000 0400a003 002b80e3 0bf38b02 0 ... ....+...... 0000000000000014:00000001 000000d5 69000070 17000033 01000001 ..... P... 3....0000000000000028:00000028 72707429 2c594553 204f5220 4e4f2c53 ... (RPT), YES OR no,s000000000000003c:59535245 4d4f5445 4c4f4749 4e532054 59504553 ysremotelogins types Middle omitted ... 0000000000000bcc:7072696d 61727920 6b65792c 616e7369 5f6e756c primary key,ansi_nul0000000000000be0:6c5f6400 00d1070 0 00000031 01000001 000100 l_d ...              1 ... OFFSET table:row-offset         0 (0x0)-(0x60)      

The option is 2, which outputs the hexadecimal data for the entire page header, displaying the contents of the whole page, including unused space.
DBCC page (' Dbpage ', 1,309,2)------------------------------------------------------------------------------------ ----------------------Page header information is omitted .../* The following is a full page of data storage, including Row records with free space, not differentiated slots */data:memory Dump @0x0000000028178000 0000000028178000:01010000 00c00001 00000000 00000800 00000000 ......... ..... 0000000028178014:00000100 23000000 ab13530c 35010000 01000000 ..... S.5 ..... 0000000028178028:27 million 90010000 44000000 00000000 00000000 ' ..... D........... 000000002817803c:00000000 01000000 00000000 00000000 00000000 ............ 0000000028178050:00000000 00000000 00000000 00000000 30000800 ......... 0...0000000028178064:01000000 0400a003 002b80e3 0bf38b02 00000001 ............ 0000000028178078:000000d5 69000070 17000033 01000001 00000028 ..... P... 3 ... (000000002817808c:72707429 2c594553 204f5220 4e4f2c53 59535245 rpt), YES OR no,sysre00000000281780a0:4d4f544 5 4c4f4749 4e532054 59504553 2c535953 motelogins types,sys omitted ... 0000000028178c1c:65726963 20726f75 6e646162 6f72742c 7072696d Eric roundabort,prim0000000028178c30:61727920 6b65792c 6 16e7369 5f6e756c 6c5f6400 ary key,ansi_null_d.0000000028178c44:00d10700 00000031 01000001 00010000 00212121 ... 1. ...!!! 0000000028178c58:21212121 21212121 21212121 21212121 21212121!!!!!!!!!!!!!!!!!!!! 0000000028178c6c:21212121 21212121 21212121 21212121 21212121!!!!!!!!!!!!!!!!!!!! Omit in ... 0000000028179fe0:21212121 21212121 21212121 21212121 21212121!!!!!!!!!!!!!!!!!!!! 0000000028179ff4:21212121 21212121 21216000!!!!!!!!!! `. OFFSET table:row-offset0 (0x0)-0x60

The option is 3, outputting the data page header data in a readable form, and includes the readable form of each field in the record, and the row overflow data will also display the data content, but the large object does not display the content, but the storage location!

DBCC page (' Dbpage ', 1,309,3)------------------------------------------------------------------------------------ -------------page header information omitted ... Slot 0 Offset 0x60 length 3059/*slot slot number, one slot row of data, starting from 0x60 = 96, length is 3059 bytes*/record Type = Primary_record R Ecord Attributes = null_bitmap Variable_columnsrecord Size = 3059Memory Dump @0x000000002bb78060/* below for this line record 3,059 bytes of content */00 00000000000000:30000800 01000000 0400a003 002b80e3 0bf38b02 0............+ ... 0000000000000014:00000001 000000d5 69000070 17000033 01000001 ..... P... 3....0000000000000028:00000028 72707429 2c594553 204f5220 4e4f2c53 ...   (RPT), YES OR no,s000000000000003c:59535245 4d4f5445 4c4f4749 4e532054 59504553 ysremotelogins TYPES0000000000000050:  2c535953 52454d4f 54454c4f 47494e53 20545950, sysremotelogins typ0000000000000064:45532028 55504441 5445292c 41463a20 61676772 ES (UPDATE), af:aggr0000000000000078:65676174 65206675 6e637469 6f6e2c41 503a2061 egate function,ap:a Middle omitted. .. 0000000000000bb8:2c6e756d 65726963 20726f75 6e646162 6f72742c, numeric roundabort,0000000000000bcc:7072696d 61727920 6b65792c 6 16e7369 5f6e756c primary key,ansi_nul0000000000000be0:6c5f6400 00d10700 00000031 01000001 000100 l_d ..... 1 .../* Below is a row record in slot 0 slot 0 detailing the storage of each column */slot 0 column 1 Offset 0x4 Length 4 Length (physical) 4/*slot 0, the first column on this page occupies With 4 bytes, the column name is ID, the value is 1*/id = 1 Namea = [BLOB Inline Root] Slot 0 column 2 Offset 0x13 length Length (physical) 24/*slot 0, 2nd column, this                           The page occupies 24 bytes, the column name is Namea here is a row overflow condition, there is no data in the column, but the actual location where the column is stored is 6000 bytes, and the value is stored on the slot 0 slot on page No. 307 of the first file */level = 0                         Unused = 0 Updateseq = 1TimeStamp = 1775566848 Type = 2Link 0 Size = 6000  RowId = (1:307:0) Slot 0 Column 3 Offset 0x2b length Length (physical) Nameb = (rpt), YES OR No,sysremotelogins types,sysremotelogins TYPES (UPDATE), af:aggregate function,ap:application,c:check CNS,... Omit in ... primary key,ansi_null_d/*slot 0, 3rdcolumn, this page occupies 3000 bytes, the column name is NAMEB, the value is nameb= after a large segment */descriptions = [Textpointer] Slot 0 Column 4 Offset 0xbe3 length Length (ph           ysical) 16/*slot 0, column 4th, which is the text data type, this page occupies 16 bytes, the column name is descriptions, and its value is stored on the first file on page No. 305 of slot 1 slot */Texttimestamp = 131137536 RowId = (1:305:1) Slot 0 Offset 0x0 Length 0 Length (physical) 0/* The table has a primary key, the row's keyhashvalue value */Keyhashvalue = (819444 3284A0)

option is 3, and a special case is that when you parse a nonclustered index, it returns one more table
CREATE TABLE tbpage (ID int primary key NOT NULL identity (max), cola int,colb varchar (), COLC varchar) INSERT INTO TB Page (COLA,COLB,COLC) Select Object_id,type,name from sys.objects CREATE index Ix_colc on tbpage (COLC) SELECT * FROM sys.in dexes where name= ' IX_COLC ' DBCC IND (' dbpage ', ' tbpage ',-1)

The DBCC page (' Dbpage ', 1,319, 3) returns a more readable list, detailing the column conditions and key values for the index, while the original message content remains.


Reference post: http://www.sqlservercentral.com/blogs/practicalsqldba/2012/10/10/ sql-serverunderstanding-the-page-free-space-pfs-page/http://www.sqlservercentral.com/blogs/practicalsqldba/ 2013/03/07/sql-server-understanding-the-iam-page/reference book: SQL Server 2012 Implementation and combat Management Guide
SQL Server Performance tuning combat, SQL Server 2005 Technology Insider storage Engine

SQL Server big Talk storage structure (1)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.