Index fragmentation:
- Internal fragmentation (or leaf fill rate): reflects how much of the space on the leaf data pages is used or left idle.
- External fragmentation: SQL Server reads in units of extents (blocks of eight consecutive pages), so when the physical order of the extents does not match the logical order, a scan has to switch between non-contiguous extents and performs extra I/O.
- Logical fragmentation: the percentage of out-of-order pages in the leaf-level pages of an index. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the "next page" pointer in the current leaf-level page.
- Extent fragmentation: the percentage of out-of-order extents in the leaf pages of a heap. An out-of-order extent is one for which the extent that contains the current page of the heap is not physically the next extent after the extent that contains the previous page. (Microsoft does not really explain this concept :( )
Querying fragmentation:
- DBCC SHOWCONTIG: four-part object name, [index name] | [index ID]
- DBCC SHOWCONTIG: ID of an object in the current database, [index name] | [index ID]
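- sys.dm_db_index_physical_stats: five parameters: database ID, object ID, index ID, partition number, scan mode
- For these parameters, 0, NULL, and DEFAULT basically mean the same thing (no filter; for the scan mode, NULL and DEFAULT mean LIMITED). The index ID is the special case: 0 is a valid index ID (the heap), so -1 is used there instead of 0.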
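As a rough illustration of the five parameters listed above, here is a minimal Python sketch that only assembles the T-SQL text for a call to sys.dm_db_index_physical_stats. The helper name physical_stats_query and the table dbo.Orders are hypothetical, and the selected columns are just a sample of what the function returns. Passing None emits NULL (no filter, or LIMITED for the scan mode); index ID 0 is passed through unchanged because it identifies the heap.

```python
VALID_MODES = {"LIMITED", "SAMPLED", "DETAILED"}


def physical_stats_query(object_name=None, index_id=None, partition=None, mode=None):
    """Return T-SQL text that calls sys.dm_db_index_physical_stats.

    Hypothetical helper for illustration: it builds a string and never
    connects to a server. The current database is always used (DB_ID()).
    """
    if mode is not None and mode.upper() not in VALID_MODES:
        raise ValueError(f"unknown scan mode: {mode}")

    # Double any single quotes so the name is safe inside an N'...' literal.
    if object_name is None:
        obj = "NULL"
    else:
        obj = "OBJECT_ID(N'{}')".format(object_name.replace("'", "''"))

    # Compare against None explicitly: index ID 0 (the heap) is a real value.
    idx = "NULL" if index_id is None else str(int(index_id))
    part = "NULL" if partition is None else str(int(partition))
    scan = "NULL" if mode is None else "'{}'".format(mode.upper())

    return (
        "SELECT index_id, avg_fragmentation_in_percent, "
        "avg_fragment_size_in_pages, page_count\n"
        f"FROM sys.dm_db_index_physical_stats(DB_ID(), {obj}, {idx}, {part}, {scan});"
    )


# Heap of a hypothetical table, scanned in LIMITED mode.
print(physical_stats_query("dbo.Orders", index_id=0, mode="limited"))
```

Calling physical_stats_query() with no arguments produces a query over every object in the current database, which is usually wider than intended; that is why validating the object name first matters.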
Basic metrics:
- Scan density (%) [optimal count : actual count]: the ratio of the "optimal count" to the "actual count". If everything is contiguous, the value is 100; if it is less than 100, some fragmentation exists. The "optimal count" is the ideal number of extent changes if everything were contiguously linked. The "actual count" is the actual number of extent changes.
- Logical scan fragmentation (%): percentage of out-of-order pages returned when the leaf-level pages of the index are scanned. This number is not relevant to heaps. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the "next page" pointer in the current leaf page.
- Extent scan fragmentation (%): percentage of out-of-order extents when the leaf-level pages of the index are scanned. This number is not relevant to heaps. An out-of-order extent is one for which the extent that contains the current index page is not physically the next extent after the extent that contains the previous index page. Note: this number is meaningless if the index spans multiple files.
- avg_page_space_used_in_percent: average percentage of page space in use. Related concepts: page splits and fill factor.
- avg_fragment_size_in_pages: average number of pages in one fragment. The larger the value, the better.
- avg_fragmentation_in_percent: fragmentation percentage (not explained further). The smaller the value, the better. It moves inversely to avg_fragment_size_in_pages!
- page_count: total number of pages scanned.
- record_count: total number of records scanned. Note: this is the number of records seen by the current scan, not necessarily the number of rows in your user table.
- forwarded_record_count: number of records in a heap that have forward pointers to another data location (rows that no longer fit in their original page after an update).
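The scan density arithmetic above can be sketched in a few lines of Python. This assumes the optimal count is simply the page count divided by eight pages per extent, rounded up; that rounding rule and the function name scan_density are assumptions made for illustration.

```python
import math

PAGES_PER_EXTENT = 8  # an extent is eight consecutive pages


def scan_density(page_count, actual_extent_switches):
    """Ratio of optimal to actual extent switches, as a percentage.

    Assumption: the optimal count is ceil(page_count / 8), i.e. the
    number of extents needed if every page were contiguously linked.
    """
    if page_count <= 0 or actual_extent_switches <= 0:
        raise ValueError("counts must be positive")
    optimal = math.ceil(page_count / PAGES_PER_EXTENT)
    return 100.0 * optimal / actual_extent_switches


# 80 pages fit in 10 extents: 10 switches is ideal, 25 means fragmentation.
print(scan_density(80, 10))
print(scan_density(80, 25))
```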
Scan Method
Because an index is essentially a B-tree, and a B-tree is hierarchical, there are several ways to scan it: only the non-leaf levels? Just a sample? Or a full scan?
The mode in which the function is executed determines the level of scanning performed to obtain the statistical data that the function uses. The mode is specified as LIMITED, SAMPLED, or DETAILED. The function traverses the page chains of the allocation units that make up the specified partitions of the table or index. sys.dm_db_index_physical_stats requires only an Intent-Shared (IS) table lock, regardless of the mode it runs in. For more information about locking, see Lock Modes.
LIMITED mode is the fastest and scans the fewest pages. For an index, only the parent-level pages of the B-tree (that is, the pages above the leaf level) are scanned. For a heap, only the associated PFS and IAM pages are examined; the data pages of the heap are not scanned. In SQL Server 2005, all pages of a heap are scanned in LIMITED mode.
In LIMITED mode, compressed_page_count is NULL because the Database Engine scans only the non-leaf pages of the B-tree and the IAM and PFS pages of the heap. Use SAMPLED mode to get an estimated value for compressed_page_count, and DETAILED mode to get the actual value.
SAMPLED mode returns statistics based on a 1 percent sample of all the pages in the index or heap. If the index or heap has fewer than 10,000 pages, DETAILED mode is used instead of SAMPLED.
DETAILED mode scans all pages and returns all statistics.
The modes get progressively slower from LIMITED to DETAILED, because more work is performed in each mode. To quickly gauge the size or fragmentation level of a table or index, use LIMITED mode. It is the fastest and does not return a row for each non-leaf level in the IN_ROW_DATA allocation unit of the index.
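The mode rules above can be summarized as a small Python model. This is only a sketch of the behavior described in the text (SAMPLED giving way to DETAILED below 10,000 pages, and compressed_page_count being unavailable in LIMITED mode); the function names are hypothetical and nothing here talks to SQL Server.

```python
SAMPLED_MIN_PAGES = 10_000  # threshold stated in the text above
VALID_MODES = ("LIMITED", "SAMPLED", "DETAILED")


def effective_scan_mode(requested, page_count):
    """Return the mode that actually applies for an index or heap.

    None stands for NULL/DEFAULT and is treated as LIMITED.
    """
    mode = (requested or "LIMITED").upper()
    if mode not in VALID_MODES:
        raise ValueError(f"unknown scan mode: {requested}")
    # Small objects are scanned in full instead of being sampled.
    if mode == "SAMPLED" and page_count < SAMPLED_MIN_PAGES:
        return "DETAILED"
    return mode


def compressed_page_count_kind(mode):
    """Describe what compressed_page_count holds for a given mode."""
    return {"LIMITED": "NULL", "SAMPLED": "estimate", "DETAILED": "actual"}[mode]


mode = effective_scan_mode("sampled", 2_500)
print(mode, compressed_page_count_kind(mode))
```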
Best practices
Always make sure that a valid ID is returned when you use DB_ID or OBJECT_ID. For example, when you use OBJECT_ID, specify a three-part name such as OBJECT_ID(N'AdventureWorks2008R2.Person.Address'), or test the values returned by the functions before you use them in sys.dm_db_index_physical_stats. If either function returns NULL, sys.dm_db_index_physical_stats treats the NULL as "all" and scans every database or object. Examples A and B in the Microsoft documentation for this function demonstrate a safe way to specify database and object IDs.
Detecting fragmentation
Fragmentation occurs through the data modifications (INSERT, UPDATE, and DELETE statements) made against a table and, therefore, against the indexes defined on the table. Because these modifications are usually not distributed evenly among the rows of the table and indexes, the fullness of each page varies over time. For queries that scan part or all of the indexes of a table, this kind of fragmentation can cause additional page reads. This hinders parallel scanning of data.
The fragmentation calculation algorithm in SQL Server 2008 is more precise than the one in SQL Server 2000, so the fragmentation values appear higher. For example, in SQL Server 2000 a table is not considered fragmented if its pages 11 and 13 are in the same extent but page 12 is not. However, accessing these pages requires two physical I/O operations, so in SQL Server 2008 this is counted as fragmentation.
The fragmentation level of an index or heap is shown in the avg_fragmentation_in_percent column. For a heap, the value represents the extent fragmentation of the heap. For an index, the value represents the logical fragmentation of the index. Unlike DBCC SHOWCONTIG, the fragmentation calculation algorithms in both cases take storage that spans multiple files into account, so the results are accurate.
Logical fragmentation
This is the percentage of out-of-order pages in the leaf-level pages of an index. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the "next page" pointer in the current leaf page.
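To make the definition of logical fragmentation concrete, here is a minimal Python sketch that applies it to a toy leaf level. It assumes a single file and represents the index as a list of physical page numbers in logical (next-page pointer) order; the function name and the page numbers are made up for illustration, and the real engine's bookkeeping is certainly more involved.

```python
def logical_fragmentation_percent(leaf_pages):
    """Percentage of out-of-order pages in a toy leaf level.

    leaf_pages: unique physical page numbers, listed in logical order.
    A page is out of order when its next-page pointer does not lead to
    the next physical page allocated to the index.
    """
    if not leaf_pages:
        return 0.0
    ordered = sorted(leaf_pages)
    # For every page, the next physical page allocated to the index.
    next_physical = dict(zip(ordered, ordered[1:]))
    out_of_order = sum(
        1
        for current, following in zip(leaf_pages, leaf_pages[1:])
        if next_physical.get(current) != following
    )
    return 100.0 * out_of_order / len(leaf_pages)


# Page 105 was linked in between 102 and 103, e.g. after a page split.
print(logical_fragmentation_percent([100, 101, 102, 105, 103, 104]))
```

In this example two of the six pages (102 and 105) have a next-page pointer that does not match the next allocated physical page, so a third of the leaf level counts as out of order.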
Extent fragmentation
This is the percentage of out-of-order extents in the leaf pages of a heap. An out-of-order extent is one for which the extent that contains the current page of the heap is not physically the next extent after the extent that contains the previous page.
For optimal performance, avg_fragmentation_in_percent should be as close to zero as possible. However, values from 0 to 10 percent are acceptable. All methods of reducing fragmentation (rebuilding, reorganizing, or re-creating) can be used to reduce these values. For details about how to analyze the degree of fragmentation in an index, see Reorganizing and Rebuilding Indexes.
Reducing fragmentation in an index
When an index is fragmented in a way that affects query performance, there are three ways to reduce the fragmentation:
1. Drop and re-create the clustered index. Re-creating a clustered index redistributes the data and results in full data pages. The level of fullness can be configured with the FILLFACTOR option of CREATE INDEX. The disadvantages of this method are that the index is offline during the drop and re-create cycle, and that the operation is atomic: if the index creation is interrupted, the index is not re-created. For more information, see CREATE INDEX (Transact-SQL).
2. Use ALTER INDEX REORGANIZE (which replaces DBCC INDEXDEFRAG) to reorder the leaf pages of the index in logical order. Because this is an online operation, the index remains available while the statement runs. The operation can be interrupted without losing the work already completed. The disadvantage of this method is that it does not reorganize the data as well as an index rebuild, and it does not update statistics.
3. Use ALTER INDEX REBUILD (which replaces DBCC DBREINDEX) to rebuild the index online or offline. For more information, see ALTER INDEX (Transact-SQL).
Fragmentation alone is not a sufficient reason to reorganize or rebuild an index. The main effect of fragmentation is that it reduces page read-ahead throughput during index scans.
This leads to longer response times. If the query workload on a fragmented table or index does not involve scans (because the workload is primarily singleton lookups), removing fragmentation may have no effect. For more information, see this Microsoft website.
Note: Running DBCC SHRINKFILE or DBCC SHRINKDATABASE may introduce fragmentation if an index is partly or completely moved during the shrink operation. Therefore, if a shrink operation must be performed, do it before fragmentation is removed.
Reducing fragmentation in a heap
To reduce the extent fragmentation of a heap, create a clustered index on the table and then drop the index. The data is redistributed while the clustered index is created, and the placement is made as optimal as possible given the distribution of free space available in the database. When the clustered index is then dropped to re-create the heap, the data is not moved and stays in its optimal position. For information about how to perform these operations, see CREATE INDEX and DROP INDEX.
Compacting large object data
By default, the ALTER INDEX REORGANIZE statement compacts pages that contain large object (LOB) data. Because LOB pages are not deallocated when they become empty, compacting this data can improve disk space use after a lot of LOB data has been deleted or a LOB column has been dropped.
Reorganizing a specified clustered index compacts all LOB columns that are contained in the clustered index. Reorganizing a nonclustered index compacts all LOB columns that are nonkey (included) columns in the index. If ALL is specified in the statement, all indexes associated with the specified table or view are reorganized. In addition, all LOB columns associated with the clustered index, the underlying table, or a nonclustered index with included columns are compacted.
Evaluating disk space use
The avg_page_space_used_in_percent column indicates page fullness.
To optimize disk space use, this value should be close to 100% for an index that does not receive many random inserts. However, an index that has many random inserts and very full pages will see an increasing number of page splits, which causes more fragmentation. Therefore, to reduce page splits, the value should be less than 100%. Rebuilding an index with the FILLFACTOR option specified lets you change the page fullness to fit the query pattern on the index. For more information about fill factor, see Fill Factor. In addition, ALTER INDEX REORGANIZE compacts an index by trying to fill pages to the FILLFACTOR that was last specified. This increases the value of avg_page_space_used_in_percent. Note that ALTER INDEX REORGANIZE cannot reduce page fullness; to do that, the index must be rebuilt.
Evaluating index fragments
A fragment is made up of physically consecutive leaf-level pages in the same file for an allocation unit. An index has at least one fragment. The maximum number of fragments an index can have is equal to the number of pages in the index. Larger fragments mean that less disk I/O is required to read the same number of pages. Therefore, the larger the avg_fragment_size_in_pages value, the better the range scan performance. avg_fragment_size_in_pages and avg_fragmentation_in_percent are inversely related: rebuilding or reorganizing an index reduces the amount of fragmentation and increases the fragment size.
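The notion of a fragment described above can be illustrated with a short Python sketch. It assumes a single file and models the leaf level as a list of physical page numbers in logical order; a fragment is then a run of physically consecutive pages. The function names and page numbers are hypothetical and only meant to show why fewer, larger fragments go together.

```python
def fragments(leaf_pages):
    """Split leaf pages (in logical order) into runs of consecutive pages."""
    runs = []
    for page in leaf_pages:
        if runs and page == runs[-1][-1] + 1:
            runs[-1].append(page)  # physically follows the previous page
        else:
            runs.append([page])    # a gap or a jump starts a new fragment
    return runs


def average_fragment_size(leaf_pages):
    """Average number of pages per fragment (0.0 for an empty index)."""
    runs = fragments(leaf_pages)
    return len(leaf_pages) / len(runs) if runs else 0.0


before = [100, 101, 102, 105, 103, 104]  # fragmented leaf level
after = sorted(before)                   # idealized result of a rebuild

print(len(fragments(before)), average_fragment_size(before))
print(len(fragments(after)), average_fragment_size(after))
```

The same six pages form three fragments before and a single fragment after, so the average fragment size grows as the fragment count drops.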