Using indexes to improve the efficiency of SQL Server data processing

Source: Internet
Author: User

Good database design comes first, but efficient use of indexes is the basis for high performance in SQL Server. SQL Server uses a cost-based optimization model: for each table referenced in a submitted query, the optimizer decides whether to use an index and, if so, which one. Most of the cost of executing a query is disk I/O, so one of the primary goals of indexing is to avoid full table scans. A full table scan must read every page of the table from disk, whereas if an index points to the data value, the query needs to read only a few pages. With reasonable indexes in place, the optimizer can use them to speed up data retrieval. Indexes do not always improve system performance, however: every insert, update, and delete also has to maintain the indexes, which adds a certain amount of work. Adding appropriate indexes where they help and removing suboptimal indexes where they do not will therefore help optimize poorly performing SQL Server applications. Practice shows that sound index design rests on analyzing and predicting the queries that will be run; only when the indexes match the way the program accesses the data does an optimal scheme result. This article analyzes the performance of SQL Server indexes and offers some practical guidance.
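The difference between a full table scan and an index lookup can be observed in a query plan. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, because it runs without a server; SQLite's optimizer and plan output differ from SQL Server's, and the table, column, and index names are hypothetical. On SQL Server the same comparison would normally be made with the execution plan or SET STATISTICS IO ON.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server.
# The orders table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 1000, i * 0.5) for i in range(10_000)],
)


def plan(sql):
    """Return the plan description SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT amount FROM orders WHERE customer_id = 42"

# No index exists yet, so every row of the table has to be examined.
plan_without_index = plan(query)

# T-SQL equivalent (assumed names): CREATE INDEX ix_orders_customer ON orders (customer_id);
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")

# With the index, the optimizer can seek directly to the matching entries.
plan_with_index = plan(query)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The first plan reports a scan of the table and the second reports a search through the index, which is the effect the paragraph above describes.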

First, using clustered indexes

With a clustered index, the actual data rows on disk are stored sorted by the values of one or more specified columns. Because the index pages of a clustered index lead directly to the data pages, finding data through a clustered index is almost always faster than through a nonclustered index. Only one clustered index can be built per table, and building one requires additional free space of at least 120% of the table's size to hold a copy of the table and the intermediate index pages. Guidelines for creating a clustered index:

1. Most tables should have a clustered index, or use partitioning, to reduce contention for the end of the table. In a high-transaction environment, locking on the last page seriously reduces system throughput.

2. Under a clustered index the data is physically ordered on the data pages and duplicate values are grouped together. For queries that contain a range check (BETWEEN, <, <=, >, >=) or use GROUP BY or ORDER BY, once the row with the first key value in the range is found, the rows with subsequent index values are guaranteed to be physically adjacent and need no further searching. This avoids large scans and greatly improves query speed.

3. On a table with frequent inserts, do not build the clustered index on a monotonically increasing column (such as an IDENTITY column); otherwise lock conflicts will occur often.

4. Do not include frequently modified columns in the clustered index, because a data row must be moved to a new location after its key value is modified.

5. Choose the clustered index based on the WHERE clauses and the types of join operations that the queries use.

The candidate columns for the clustered index are:

1. A primary key column that is used in WHERE clauses and whose values are inserted in random order.

2. Columns accessed by range, such as a lower bound on Pri_order combined with Pri_order < 200.

3. Columns used in GROUP BY or ORDER BY.

4. Columns that are not frequently modified.

5. Columns used in join operations.
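The benefit of clustering for range access can be sketched as follows. This is an illustration under stated assumptions, not SQL Server itself: it uses Python's sqlite3 module, where a WITHOUT ROWID table stores its rows inside the primary-key B-tree and so behaves roughly like a table with a clustered index. The table name price_list, the label column, and the lower bound of 100 are hypothetical; only the column name pri_order and the upper bound of 200 come from the text.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")

# WITHOUT ROWID keeps the rows in primary-key order inside one B-tree,
# which is SQLite's closest analogue of a clustered index.
# T-SQL equivalent (assumed names):
#   CREATE CLUSTERED INDEX ix_price_list_pri_order ON price_list (pri_order);
conn.execute(
    "CREATE TABLE price_list (pri_order INTEGER PRIMARY KEY, label TEXT) WITHOUT ROWID"
)

# Insert the keys in random order; the storage order still follows the key.
keys = list(range(1000))
random.Random(0).shuffle(keys)
conn.executemany(
    "INSERT INTO price_list VALUES (?, ?)",
    [(k, f"item-{k}") for k in keys],
)

# Range check plus ORDER BY on the clustering key (bounds are illustrative).
range_query = (
    "SELECT pri_order, label FROM price_list "
    "WHERE pri_order > 100 AND pri_order < 200 ORDER BY pri_order"
)

plan_detail = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + range_query)
)
matched = [row[0] for row in conn.execute(range_query)]

print(plan_detail)
print(len(matched), "rows, from", matched[0], "to", matched[-1])
```

The plan shows a single search on the key with no separate sorting step: once the first qualifying row is located, the remaining rows are read in order from adjacent positions, so the ORDER BY costs nothing extra.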

Second, using nonclustered indexes

The indexes that SQL Server creates by default are nonclustered. A nonclustered index does not rearrange the data in the table; instead, it stores the indexed column values for each row together with a pointer to the page where the data resides. In other words, a nonclustered index has an extra level between the index structure and the data itself. A table without a clustered index can have as many as 250 nonclustered indexes (the exact limit depends on the SQL Server version). Each nonclustered index provides a different sort order for accessing the data. When creating a nonclustered index, weigh the faster queries it brings against the slower modifications it causes. In addition, consider these questions:

1. How much space the index needs.

2. Whether the candidate columns are stable.

3. How to choose the index key so that scans are most effective.

4. Whether there are many duplicate values.
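The extra level between a nonclustered index and the data, and the way the choice of index key affects it, can be made visible in a query plan. The sketch below again uses Python's sqlite3 module as a stand-in for SQL Server, with a hypothetical employees table and index name. An ordinary SQLite index is a secondary structure that stores the key values plus a locator for each row, which is comparable to a nonclustered index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (dept, salary) VALUES (?, ?)",
    [(f"dept-{i % 50}", 1000.0 + i) for i in range(5000)],
)

# Secondary index: holds dept values and row locators, leaves table rows in place.
# T-SQL equivalent (assumed names):
#   CREATE NONCLUSTERED INDEX ix_employees_dept ON employees (dept);
conn.execute("CREATE INDEX ix_employees_dept ON employees (dept)")


def plan(sql):
    """Return the plan description SQLite reports for a statement."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# salary is not stored in the index, so every index hit needs a second
# step from the index entry to the table row.
plan_with_lookup = plan("SELECT salary FROM employees WHERE dept = 'dept-7'")

# dept alone can be answered from the index without visiting the table rows.
plan_index_only = plan("SELECT dept FROM employees WHERE dept = 'dept-7'")

print("needs table lookup:", plan_with_lookup)
print("index only:        ", plan_index_only)
```

SQLite labels the second plan a covering index. When the index key already contains every column a query needs, the extra hop to the data disappears, which is one reason the choice of key columns matters.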

On frequently updated tables, nonclustered indexes cost more to maintain than a clustered index or no index at all. For each row that is moved to a new page, the leaf-level row of every nonclustered index that points to that data must also be updated, which may sometimes require index pages to be split. Deleting data from a page has similar overhead, and the deletion process must also move the remaining data toward the top of the page to keep it contiguous. Nonclustered indexes should therefore be created with great care. They are often used in the following situations:

1. A column is often used in aggregate functions (such as SUM, ...).

2. A column is often used in JOIN, ORDER BY, or GROUP BY.

3. The query returns no more than 20% of the rows in the table.
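Whether a query stays within the 20% guideline can be estimated by counting rows before deciding on an index. The following is a rough sketch, again using Python's sqlite3 module with a hypothetical tickets table; the helper name fraction_returned and the sample predicates are illustrative assumptions. The same COUNT(*) queries could be run against SQL Server directly.

```python
import sqlite3

RETURNED_FRACTION_LIMIT = 0.20  # the 20% guideline quoted in the text


def fraction_returned(conn, table, where_clause, params=()):
    """Return the fraction of the table's rows that satisfy where_clause.

    table and where_clause are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    hits = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where_clause}", params
    ).fetchone()[0]
    return hits / total if total else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, priority INTEGER)")
conn.executemany(
    "INSERT INTO tickets (priority) VALUES (?)",
    [(i % 10,) for i in range(1000)],  # ten priority values, 100 rows each
)

narrow = fraction_returned(conn, "tickets", "priority = ?", (3,))
wide = fraction_returned(conn, "tickets", "priority >= ?", (5,))

for label, fraction in (("priority = 3", narrow), ("priority >= 5", wide)):
    verdict = "within" if fraction <= RETURNED_FRACTION_LIMIT else "above"
    print(f"{label}: {fraction:.0%} of rows, {verdict} the 20% guideline")
```

In this sample data the equality predicate returns a tenth of the table and is a plausible candidate for a nonclustered index, while the range predicate returns half of the table and is not.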
