[Sorting]-Index Fragmentation

Source: Internet
Author: User
Pages & Extents (page and extended Disk Area)
The most basic data storage unit of SQL Server 2000 is data page, an 8 K storage space. When a bucket is allocated, SQL Server 2000 does not allocate one page at a time. The basic bucket allocation unit is the continuous space of eight pages, which is called extent.
For more information about the page, extents, and index structures of SQL Server 2000, see MSDN-Pages and Extents, MSDN-Table and Index Architecture.
 
Page Split)
During Insert/Update, if the page to be updated is full and cannot accommodate new data, SQL Server splits half of the page data, re-allocate a page for storage, and then perform the Insert/Update operation. The operation that splits a full data page into two data pages is called page split.
Whether it is data page or index page, page split occurs. During the Insert operation, if the free space on the page is smaller than the size of the record to be inserted, the page is split. During the Update operation, if the table contains a widening field, if the length of the width field increases, the free space on the original page is insufficient and the page is split.
 
Index Fragmentation (Index Fragmentation)
The index fragmentation of SQL Server has two types: external fragmentation and internal fragmentation.
External fragmentation:
When the logical order of the Index page is discontinuous, it is called external fragmentation. When an Index is created, the storage of the index page is logically continuous. During the insert operation, you may need to insert the new index between two indexes. If there is enough space for the index page at the index Insertion Location, a new index value will be inserted directly in this index page; if the space on this index page is full or there is not enough space required for the new index value, SQL Server splits the page and removes part of the index page data at the inserted position, to release the output space to insert a new index. The removed data is reallocated to the new page for storage elsewhere. In this way, as the insert operation increases, the logic continuity of the index page will decrease.
After the example index is created, the index structure is logically continuous:

If a new index with an index value of 2 needs to be inserted at this time, the inserted index structure is as follows:

After the index page is inserted, the structure of the index page is logically discontinuous.
External fragmentation does not significantly affect the query performance when a specific record is returned through index or when no sorting is required. When you need to return the specified sorting record set, you need to perform additional processing on the non-consecutive index pages in the logic during the sorting process. For tables with large data volumes, if there are many index pages, external fragmentation is very serious and requires high query costs. In addition, external fragmentation affects the cache efficiency.
External fragmentation uses two metrics to describe page continuity and extent continuity.
Internal fragmentation:
If the storage space in the Index page does not reach the maximum storage capacity, it is called internal fragmentation. Regardless of the fill factor, the index structure is logically continuous when the index is created, and each index page is fully stored and fully utilized. The Delete operation causes the index page to be idle. In the External fragmentation example, the page split during the insert operation also causes the index page to be idle.
Serious internal fragmentation causes the index page to occupy much larger storage space than actually needed. Logical READS, I/O, and other operations are added during index scan in the query, resulting in performance problems.
Internal fragmentation uses the average page idleness/utilization as an indicator.
Fill factor:
If a fill factor value is set or automatically maintained by SQL Server, each index page is not fully stored during index creation, and some free space is reserved Based on the fill factor value. In the external fragmentation example, if the first index page is not fully stored, you do not need to split the index page when inserting a new index with an index value of 2, thus, the insert operation can be improved.
Fill factor is used in tables that require frequent insert/update operations to avoid the emergence of a large number of page splits. Obviously, the use of fill factor is similar to internal fragmentation, but it brings great improvements to a large number of insert operations and various data page structures. For fill factor, it is not easy to accurately assess the value of which is the best. Microsoft recommends that SQL Server be automatically maintained. Improper fill factor settings affect SQL Server performance, just like internal fragmentation.
 
DBCC SHOWCONFIG
Displays data and index fragmentation information.
Dbcc showcontig (TblUserItem, PK_TblUserItem)
Displays the fragmentation information of the PK_TblUserItem index in the TblUserItem table.
Dbcc showcontig (TblUserItem) WITH ALL_INDEXES
Displays the fragmentation information of all indexes of the table TblUserItem.
Dbcc showcontig with ALL_INDEXES
Displays the fragmentation information of all indexes in the current database.
The execution result is as follows:
Dbcc showcontig scanning 'tbluseritem' table...
Table: 'tbluseritem' (1077578877); index ID: 1, database ID: 8
TABLE level scan completed MED.
-Pages Scanned ......
-Extents Scanned ......: 7563
-Extent Switches...
-Avg. Pages per Extent...
-Scan Density [Best Count: Actual Count] ......: 94.07% [7117: 7566]
-Logical Scan Fragmentation ......: 0.00%
-Extent Scan Fragmentation ..........: 0.03%
-Avg. Bytes Free per Page ......: 114.3
-Avg. Page Density (full) ......: 98.59%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Page Scanned: Number of pages actually Scanned. The actual number of data pages can be roughly calculated from the size of each data row and the total number of rows. If the number of page Scanned far exceeds the actual number of data pages calculated, the internal fragmentation is serious.
Extents Scanned: the ideal value is to round Page Scanned/8 to the smallest integer. If Extents Scanned is greater than the theoretical value, there is a certain degree of external fragmentation.
Extent Switches: the ideal value is Extents Scanned minus 1. exceeding this value indicates there is external fragmentation.
Avg. Pages per Extent: the ideal value is 8. If the value is smaller than 8, external fragmentation exists.
Scan Density [Best Count: Actual Count]: This is the most meaningful value returned by dbcc showcontig. The ideal number of extents is one of the important statistics of external fragmentation compared to the Actual number of extents. The ideal value is 100%, which cannot be lower than 60%.
Logical Scan Fragmentation: another very meaningful value that indicates the degree of page continuity and reflects one of the important statistical information of external fragmentation. It should be between 0% and 10%, and cannot exceed 15%.
Extent Scan Fragmentation: indicates the degree of non-continuity of extents, the ideal value is 0%.
Avg. Bytes Free per Page: Average number of idle Bytes per page. An excessively high value indicates internal fragmentation, but the fill factor must be excluded.
Avg. Page Density (full): A percentage parameter opposite to Avg. Bytes Free per Page. A lower value indicates internal fragmentation.
In addition, dbcc showcontig has several optional parameters. For more information, see Online Help.
 
Resolving fragmentation issues
1. Drop the original index and recreate these indexes.
In this process, indexes are dropped and rebuild, which blocks all queries during this period. This method is used for all clustered indexes and non-clustered indexes, this may cause non-clustered index to be rebuilt twice.
The advantage is that the index is completely rebuilt to achieve the ideal situation. This method should be used if external and internal fragmentation are both serious.
2. Use the DROP_EXISTING clause
The DROP_EXISTING clause can be used to prevent the non-clustered index from being rebuilt twice.
3. DBCC DBREINDEX
You can specify only the Table name without specifying the INDEX name. This command automatically recreates all the indexes of the Table, which is easier than writing multiple drop index and create index statements. This command re-creates the primary key and UNIQUE constraints and STATISTICS of the Table at the same time without any additional operations on these constraints and STATISTICS.
Dbcc dbreindex can fully use multiple CPUs for processing. It is faster for table operations with large data volumes and extremely severe fragmentation.
This method completes operations in a transaction and requires sufficient free space in the data file to reconstruct all indexes and related objects. Otherwise, the operation may fail, or the reconstruction is incomplete. For example, after the reconstruction, the logical fragmentation may still be greater than 0. For tables with a large amount of data, more free space is required. Pay special attention to this.
4. DBCC INDEXDEFRAG
Dbcc indexdefrag is performed in two steps. First, each index page is compressed to release redundant pages. Then, the root nodes of the index page are reorganized, make the logical order of the index page consistent with that of the physical storage, that is, ensure that the logical order is continuous in the physical storage direction.
In the four methods, the other three must be performed in the offline database, because during the operation, all queries using these indexes will be blocked. Dbcc indexdefrag can be executed when the database is online, but the sorting is incomplete. On the one hand, the index page to be locked will be ignored during execution, and on the other hand, it will not assign a new page for re-sorting, but will only reorganize it in the originally allocated page space. The purpose of this command is to make the logical order of the index page consistent with that of the physical storage. If there is an interval between the physical storage of two adjacent pages or extent logically, dbcc indexdefrag does not take actions to eliminate this physical storage interval. Therefore, if the space occupied by the index page is very large, accessing the index page space may increase the head locating and moving overhead, thus increasing the I/O operation cost to a certain extent.
 
Expert suggestion: dbcc dbreindex is used for 60% <Scan Density <75%, 10 <Logical fragmentation <15; dbcc indexdefrag is used for Scan Density <60%, Logical fragmentation> 15.
 
In addition, you can consider the following aspects in table Design:
1. For tables with frequent Insert/Update operations, select an appropriate fill factor.
2. design the widening field as an equal width field. For example, material number, order number, Customer Code, and supplier code.
3. Set the delete tag instead of deleting data physically.
For example, in some large systems, an object may have dozens of fields. Generally, one table is used to accommodate all these fields, but two tables can be used for storage for system access efficiency. The primary table stores key and most frequently accessed field attributes. Do not use widening fields as much as possible. The table stores additional descriptive and infrequently accessed field attributes. In this way, although it is a one-to-one association table, in most cases, the query that only needs to access the attributes of the primary table can greatly improve the access performance.
 
Reference: Randy Dyess-SQL Server Index Fragmentation and Its Resolution.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.