SQL Server Index and table architecture (nonclustered indexes)

Source: Internet
Author: User

Original: SQL Server index and table Architecture (nonclustered index)

Nonclustered indexes

Overview

for nonclustered indexes, the information involved is a bit more than the clustered index, because the entire length of the page is much more involved in the next "index of the included column ", "Index fragmentation" and some other knowledge points, may be combined to read to understand more easily. Nonclustered indexes and clustered indexes are B-tree structures, but nonclustered indexes do not change the way data is stored, so a table allows multiple nonclustered indexes to be built; the leaf layer of a nonclustered index is made up of index pages instead of data pages, index rows contain index key values and row locators pointing to the table data storage location.

You can either use a clustered index to define a nonclustered index for a table or view, or you can define a nonclustered index based on the heap. Each index row in a nonclustered index contains a nonclustered key value and a row locator. This locator points to a clustered index or a data row in the heap that contains the key value.

Body

    • Nonclustered index structure in a single partition

Nonclustered index index_id>1 can be combined with a statement query

SELECTO.name asTABLE_NAME,P.INDEX_ID, I.name asIndex_name, Au.type_desc asAllocation_type, Au.data_pages, partition_number,p.rows,x.first_page,x.root_page,x.first_iam_page,x.filegroup_id, X.total_pages,x.used_pages fromSys.allocation_units asauJOINSys.partitions asP onau.container_id=p.partition_idJOINSys.objects asO onP.object_id =O.object_id    JOINSys.indexes asI onp.index_id=i.index_id andI.object_id =P.object_id    JoinSys.system_internals_allocation_units asX onau.container_id=x.container_idORDER  byO.name, p.index_id;

A row locator in a nonclustered index row, or a pointer to a row, or a clustered index key for a row, as described below:

    • If the table is a heap (meaning that the table does not have a clustered index), the row locator is a pointer to the row. The pointer is generated by the file identifier (ID), the page number, and the row count on the page. The entire pointer is called the row ID (RID).

If the table has a clustered index or a clustered index on the indexed view, the row locator is the clustered index key for the row. If the clustered index is not a unique index, SQL Server adds internally generated values (called unique values ) so that all duplicate keys are unique. This four-byte value is not visible to the user. This value is added only if you need to make the clustered key unique for use in a nonclustered index. SQL Server retrieves a data row by using the clustered index key stored in a leaf row of a nonclustered index to search for a clustered index.

    • A nonclustered index is compared to a clustered index:

A) leaf nodes are not data nodes.
B) The leaf node stores a "key-pointer" pair for each real data row.
C) The leaf node also stores a pointer offset, which can be positioned to a specific data row based on the page pointer and pointer offset.
D) Similarly, the other index nodes outside the leaf node are stored in a similar context, except that it refers to the index page at the next level.

A clustered index is a sparse index, and the index page on the top level of the data page stores the page pointer, not the row pointer. For a nonclustered index, it is a dense index that stores an index record for each row of data at the top-level index page of the data page.

Note: The data page in is a clustered index or a heap data row, not a nonclustered index data page, there is no data page in the nonclustered index, the leaf layer and the root node in the nonclustered index are a little different from the middle node, and its pointer is to the data row, and if the nonclustered index is the containing column index, The containing column is stored only at the leaf level, and the key value can be stored at all levels, and this block is described in the next included column index.

For index records of the root and intermediate levels, its structure includes:
A) Indexed field values
B) RowId (that is, the page pointer + pointer offset of the corresponding data page). The index page in the upper level contains rowid to precisely locate the data row when the data is changed when the index allows duplicate values.
C) pointers to next-level index pages

For the Index object of the leaf layer, its structure includes:
A) Indexed field values
B) RowId

Because index builds are stored in index pages, retrieving individual index key values is very efficient because you do not need to navigate to the data page to find the data in the index page, and the space for the Jianjian index nonclustered index is smaller than the clustered index because nonclustered indexes do not need to store the data rows, except for the built-in full coverage index.

    • Selection of nonclustered index columns
    1. contains columns that are frequently included in the query's search criteria (for example, a WHERE clause that returns an exact match)
    2. often acts as join or GROUP by clauses
    3. try to avoid using a composite column to index, unless the combination column is used in where, you can use the Include column index instead of the composite index, select the combined field to do the index, The first field selection of a combined field is important, the first field must be frequently used fields, such as the AB field as the combined field, when the where with a field as the search condition, the query will use the index lookup; When you use B as a search condition for where, the query uses an index scan, Although we cannot be absolutely certain that the efficiency of the search is better than scanning, it is also telling us to choose the order of the indexed columns and even the columns.
    4. A large number of distinct values, such as a combination of last name and first name (provided that the clustered index is used for other columns). Do not choose such as the gender of the duplicate value of the column, this situation table scan is more efficient than the search, so sometimes when we use the query plan analysis is not necessarily a scan will be worse than the search, we have to analyze the problem according to the actual situation. The
    5. overwrites the query.
      When an index contains all the columns in a query, performance can be improved. The query optimizer can find all the column values within the index and does not access the table or clustered index data, which reduces disk I/O operations. Use an index with a containing column to add an overlay column instead of creating a wide index key. For more information, see Indexes with included columns.
      If the table has a clustered index, the columns defined in the clustered index are automatically appended to the end of each nonclustered index on the table. This can generate an overwrite query instead of specifying a clustered index column in the nonclustered index definition. For example, if a table has a clustered index on a C column, the nonclustered indexes for b and A columns will have their own key-value columns B , Strong>a and C

There is no absolutely perfect thing in the world, index is same, bring us query efficiency also have the disadvantage

    • A large number of indexes on a table affects the performance of the INSERT, UPDATE, DELETE, and MERGE statements, because all indexes need to be appropriately adjusted when the data in the table changes

Summarize

This article is more important is to tell the index of the storage structure and find the way, not to tell the index of some basic concepts and the wording of the statement, there are many online writing is very good in this aspect of the article. Hope to write this article can bring you help, the article is some of the content is copied from the other author, because I think the original author (kissknife) In this respect has been described very in place, so borrowed a bit, also if the article has the unreasonable place also hope that you put forward.

Note:

pursuer.chen

Blog:http://www.cnblogs.com/chenmh

This site all the essays are original, welcome to reprint, but reprint must indicate the source of the article, and at the beginning of the article clearly to the link, otherwise reserves the right to hold responsibility.

Welcome to the exchange of discussions

SQL Server Index and table architecture (nonclustered indexes)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.