SQL Server Optimization-index fragmentation

Source: Internet
Author: User
ArticleDirectory
    • Clustered Indexes

First, we will introduce some index fragmentation related concepts and check and organize the Methods:

Two types of index fragmentation

One external Fragment

The cause is that the index is not arranged in the logical order.

For example, the order of index page allocation is

Page 1 page 2 page 3

Data: 2 4 6 8 10 12 14 16 18 20 22 24

When we insert new data, for example, 5, the system may allocate the data and generate a new index page.

Page 1 page 2 Page 3 page 4

Data: 2 4 5 10 12 14 16 18 20 22 24 6 8

In this case, if we want to query 4-10 data, we need an additional page to return both 6 and 8 data.

2. Internal index fragmentation

The cause is that the index page does not make full use of the allocated space, and the internal index fragmentation will increase the index space.

3. We can use DBCC showcontig to check index fragmentation.

Usage:

DBCC showcontig-for detailed usage, see sqlserver online documentation

Example:

Declare @ table_id int
Set @ table_id = object_id ('ta066 ')
DBCC showcontig (@ table_id)

Four-fragment sorting (re-indexing)

The scanning fragmentation of the logical scan and the extended disk area is very large, and the index fragmentation needs to be processed.

There are two solutions: one is to use DBCC indexdefrag to sort out index fragmentation, and the other is to use DBCC dbreindex to reconstruct the index. Both have their own advantages and disadvantages. Microsoft documentation explanation:

The DBCC indexdefrag command is an online operation, so the index is only available when the command is running. In addition, the operation can be interrupted without losing the completed work. The disadvantage of this method is that there is no clustered index removal/re-creation operation in reorganizing data.

Re-create a clustered index will re-organize the data. The result is that the data page is filled up. You can use the fillfactor option to configure the fill level. The disadvantage of this method is that the index is offline during the deletion/re-creation period and the operation belongs to the atomic level. If the index creation is interrupted, the index will not be re-created.

That is to say, to achieve good results, you still need to re-build the index, so you have to re-build the index.
DBCC dbreindex (table, index name, fill factor)
The first parameter can be the table name or table ID.
The second parameter, if it is '', affects all indexes of the table.
The third parameter is the fill factor, that is, the data fill degree on the index page. If it is 100, it means that every index page is filled up. At this time, the Select efficiency is the highest, but when you want to insert an index in the future, you have to move the limit ?, Indicates that the previous fill factor value is used.

DBCC dbreindex (A, '', 90)

5. Case study:

Recently, it took six seconds to adjust an SQL statement. It was found that there were many index fragments and indexes were rebuilt.

Ddeclare @ table_id int
Set @ table_id = object_id ('tc005 ')
DBCC showcontig (@ table_id)

DBCC showcontig scanning 'tc005 'table...
Table: 'tc00' (1453352342); index ID: 1, Database ID: 8
Table level scan completed MED.
-Pages scanned ......
-Extents scanned ......: 12031
-Extent switches...
-Avg. pages per extent...
-Scan density [best count: actual count] ......: 12.58% [7971: 63372] -- it is better to set the value to close to 100%.
-Logical scan fragmentation ......: 50.35% -- the value is close to 0.
-Extent scan fragmentation ......: 92.39% -- the value is close to 0.
-Avg. bytes free per page ......: 6048.2
-Avg. Page density (full) ......: 25.28%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

DBCC dbreindex ('newjoyo2. DBO. tc005 ', '', 90)

Check again:

Declare @ table_id int
Set @ table_id = object_id ('tc005 ')
DBCC showcontig (@ table_id)

DBCC showcontig scanning 'tc005 'table...
Table: 'tc00' (1453352342); index ID: 1, Database ID: 8
Table level scan completed MED.
-Pages scanned ......
-Extents scanned ......: 2232
-Extent switches...
-Avg. pages per extent...
-Scan density [best count: actual count] ......: 99.01% [2210:2232] -- close to 100%
-Logical scan fragmentation ......: 0.04%
-Extent scan fragmentation ..........: 0.40%
-Avg. bytes free per page ......: 707.6
-Avg. Page density (full) ......: 91.26%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Trace again and find that the CPU time is basically within 100 milliseconds:

The sample data is as follows:

Tracestart CPU reads writes duration client_processid spid 2007-10-10 07:40:30. 267
SQL: batchcompleted Internet Information Services joyo2 62 21474 0 63 3784 286 173 07:41:07.
SQL: batchcompleted Internet Information Services joyo2 62 21474 0 60 7496 233 07:41:10. 910
SQL: batchcompleted Internet Information Services joyo2 46 478 0 46 4992 258 847 07:41:19.
SQL: batchcompleted Internet Information Services joyo2 47 959 0 46 7496 233 07:41:21. 800
SQL: batchcompleted Internet Information Services joyo2 0 128 0 3784 286 503 07:41:42.
SQL: batchcompleted Internet Information Services joyo2 0 176 0 7496 233 313 07:42:03.
SQL: batchcompleted Internet Information Services joyo2 0 525 0 3784 286 597 07:42:20.
SQL: batchcompleted Internet Information Services joyo2 47 3264 0 60 4992 222 07:42:21. 723
SQL: batchcompleted Internet Information Services joyo2 79 7871 0 76 3784 286 783 07:42:47.

The results are still unsatisfactory. Finally, check the SQL statements and find that the clustered index is not used for Association of two tens of millions of tables. This SQL statement can no longer be optimized, and the entire SQL statement is mostly consumed on bookmark lookup.

Conclusion:

The reason for the planned bookmark lookup step is that the clustered index is not used and cannot be avoided.

There are two types of sqlserver indexes: clustered indexes and non-clustered indexes: clustered indexes and tables are integrated (similar to the unique index of Oracle ), instead, the clustered index node stores the index itself and stores the relevant information of the clustered index. When searching, find the corresponding clustered index value before searching.

A table can only have one clustered index (data storage is physically stored in the order in which data is imported, which is very valuable ), therefore, we recommend that you fully consider future queries (especially big tables in the Business System) during design. We do not recommend using the idnetity field as a clustered index. This is also strongly recommended by Microsoft.

The efficiency of index search should be:

Index overwrite (which is sometimes not fully implemented)> clustered index seek> index seek

Find out the documentation explanation:

Clustered Indexes

Clustered indexes have one row inSysindexesWithIndid= 1. the pages in the data chain and the rows in them are ordered on the value of the clustered index key. all inserts are made at the point the key value in the inserted row fits in the ordering sequence.

Microsoft SQL Server 2000 indexes are organized as B-trees. each page in an index holds a page header followed by index rows. each index row contains a key value and a pointer to either a lower-level page or a data row. each page in an index is called an index node. the top node of the B-tree is called the root node. the bottom layer of nodes in the index are called the leaf nodes. the pages in each level of the index are linked together in a doubly-linked list. in a clustered index, the data pages make up the leaf nodes. any index levels between the root and the leaves are collectively known as intermediate levels.

For a clustered index,Sysindexes. RootPoints to the top of the clustered index. SQL Server navigates down the index to find the row corresponding to a clustered index key. to find a range of keys, SQL Server navigates through the index to find the starting key value in the range, and then scans through the data pages using the previous or next pointers. to find the first page in the chain of data pages, SQL Server follows the leftmost pointers from the root node of the index.

This practice shows the structure of a clustered index.

Finally, let's summarize:

Clustering indexes and tables are integrated, and the Data Order is orderly (physical order). Therefore, clustering indexes are very effective for searching, in particular, when a large number of consecutive returned result sets are queried, a large number of values are directly returned in sequence (if you use a clustered index, you can imagine how many I/O operations are there ). If a table does not have a clustered index, the data in the table is unordered, which is called Heap Storage (that is, a large amount of data, unordered ). If you create a non-clustered index (with a completely independent structure from the table), the row locator pointing to the data from the index row is called the row locator. The structure of the row locator depends on whether the data page is stored in a stack or a cluster. For a heap set, the row locator is the pointer to the row; for clustering, the row locator is the clustered index key value. Therefore, non-clustered indexes in a clustered table are slower than those in a clustered table. Frequent deletion and modification may cause many fragments in the clustered table. Text column affects the performance of clustered tables. An index in the sqlserve view is a clustered index by default. Therefore, we need to carefully consider this index. We do not need to create it. For more information about this document, see the following link and SQL Server online documentation: Http://topic.csdn.net/t/20051201/11/4430562.html

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.