Removing Bookmark Lookup, RID Lookup, and Key Lookup to Improve SQL Server Query Performance (VI)

Source: Internet
Author: User

Objective

The previous sections covered the basics. This section turns to index performance tuning. When working with large amounts of data, indexes are usually the first thing that comes to mind, yet many of us only scramble to search for answers once a problem appears. It is better to build solid fundamentals ahead of time. As always, the content is short and the goal is in-depth understanding: rather than handing over a fix the moment a problem shows up, we first state the problem and then work through the solution.

Definitions of Bookmark Lookup, RID Lookup, and Key Lookup

Readers who have not studied indexes in depth may find these three terms baffling. Translated literally, they are bookmark lookup, row ID (RID) lookup, and key lookup. Bookmark Lookup is the name that was used before SQL Server 2005 for what is now called Key Lookup. How should they be explained and defined? Rather than starting from a formal definition, we work through the analysis step by step below. If that is not enough, see the comments by the community member "Forever Red", whose explanation is very thorough. First, a brief outline of the three concepts.

Suppose a query filters on a column that has a nonclustered index. The optimizer may use that index to find the matching rows, but if the query also returns columns that are not part of the index, the engine has to go back to the underlying data to fetch them, whether or not the table has a clustered index. When the table has a clustered index, the missing columns are fetched from the clustered index, and the operation is called a Key Lookup (Bookmark Lookup in older versions). When the table has no clustered index (a heap) but does have a nonclustered index, the rows are fetched by row ID, and the operation is called a RID Lookup. In every case the engine makes an extra trip to the base table, so removing these lookups can improve query performance. Let's look at how.
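
The difference between the two lookup types can be sketched with two small tables. This is a minimal illustration, not part of the original walkthrough: the tables dbo.DemoHeap and dbo.DemoClustered, their columns, and the index names are all hypothetical, and the index hints are there only to force the nonclustered index so that the lookup operator shows up in the plan.

```sql
-- Hypothetical demo objects; names and column sizes are assumptions.
CREATE TABLE dbo.DemoHeap (Id INT, City VARCHAR(50), Address VARCHAR(100));
CREATE NONCLUSTERED INDEX ix_demoheap_city ON dbo.DemoHeap (City);

CREATE TABLE dbo.DemoClustered (Id INT, City VARCHAR(50), Address VARCHAR(100));
CREATE CLUSTERED INDEX cx_democlustered_id ON dbo.DemoClustered (Id);
CREATE NONCLUSTERED INDEX ix_democlustered_city ON dbo.DemoClustered (City);
GO

-- Heap: Address is not in the index, so rows are fetched by row ID (RID Lookup).
SELECT Id, City, Address
FROM dbo.DemoHeap WITH (INDEX(ix_demoheap_city))
WHERE City = 'Shenzhen';

-- Clustered table: Address is fetched through the clustered key (Key Lookup).
SELECT Id, City, Address
FROM dbo.DemoClustered WITH (INDEX(ix_democlustered_city))
WHERE City = 'Shenzhen';
GO
```

Viewing the actual execution plan for each query should show which lookup operator the engine chose.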

Reproducing the Bookmark Lookup, RID Lookup, and Key Lookup Problem

We first create the following table:

USE TSQL2012
GO
-- The VARCHAR lengths are assumed placeholders; adjust them to your data.
CREATE TABLE Sales.Orders (
    [OrderID] INT,
    [ShipAddress] VARCHAR(100),
    [ShipCity] VARCHAR(100),
    [ShipRegion] VARCHAR(100)
)
GO

Then run a query:

SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders
WHERE ShipCity = 'Shenzhen'
GO

As expected, with no index on the table the execution plan is a full table scan. Next we create a clustered index on OrderID:

CREATE CLUSTERED INDEX Idx_cls_orderid on sales.orders (OrderID)

We run the query above again.

Now that the table has a clustered index, the query reads through it, and the plan changes from a table scan to a clustered index scan. The query has had a filter condition all along, but so far we have done nothing to support it. Creating an index on the filtered column should improve performance somewhat, so we start by creating a nonclustered index on that column.
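
A minimal sketch of such an index, assuming the query filters on ShipCity. The index name idx_noncls_shipcity is hypothetical and does not come from the original article.

```sql
-- Assumption: ShipCity is the filtered column; the index name is hypothetical.
CREATE NONCLUSTERED INDEX idx_noncls_shipcity
ON Sales.Orders (ShipCity);
GO
```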

Let's run the query again.

With a nonclustered index on the filtered column, the plan now uses that index to find the matching rows. However, the returned columns that are not part of the index, such as ShipAddress and ShipRegion, still have to be fetched, so the query engine goes back to the base table for them. This behavior is called a Bookmark Lookup or Key Lookup. As the title of this article promises, we now remove it, and we try two different approaches.

Resolving the Bookmark Lookup, RID Lookup, and Key Lookup Problem

Creating a covering nonclustered index

We create a nonclustered index on both the filtered column and the returned columns:

CREATE NONCLUSTERED INDEX idx_all_cover
ON Sales.Orders (ShipAddress, OrderID, ShipCity, ShipRegion)

Because the index now contains every returned column, the engine no longer visits the data pages and answers the query directly from the index, so the Key Lookup is gone. But this raises another problem: the plan now shows an index scan. What is an index, exactly? An index in a database is like the index at the front of a book: it lets us quickly find what we need in the book, which here corresponds to the table.

An index scan reads every row at the leaf level of the index and returns those that satisfy the condition, whereas a table scan reads the rows of the table directly. The two are therefore slightly different, but both touch every row. An index seek, by contrast, uses the index pages to navigate straight to the rows that satisfy the condition, so it touches only those rows and the pages that contain them. That is why an index seek is more efficient.

That was a brief aside on index scans and index seeks. The puzzle remains: we created a nonclustered index, yet the plan shows an index scan. As a beginner with indexes, I was at a loss. I first suspected caching, but clearing the various caches made no difference. I then wondered whether NULL values or duplicate values in the filtered column were the cause. After countless attempts, one finally worked:

CREATE NONCLUSTERED INDEX idx_cls_cover
ON Sales.Orders (ShipCity, OrderID, ShipAddress, ShipRegion)

Now suppose we change the query condition as follows:

SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders
WHERE ShipAddress = 'Shenzhen'
GO

The only difference between the two indexes is the column order. Whether the filtered column comes first in the nonclustered index decides whether the plan uses an index scan or an index seek. So when is an index seek used? We can summarize as follows:

General rule for index seeks: for a condition in a WHERE or ON clause, an index seek is used when the filtered column is the first (leading) key column of the index.
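
The rule can be checked with a pair of queries against the same index. This is a sketch under stated assumptions: it relies on idx_cls_cover existing with the key order (ShipCity, OrderID, ShipAddress, ShipRegion), the index hints only pin the index so that the two plans are comparable, and the value 'Guangdong' is a made-up example.

```sql
-- ShipCity is the leading key column of idx_cls_cover,
-- so this predicate can be resolved with an index seek.
SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(idx_cls_cover))
WHERE ShipCity = 'Shenzhen';

-- ShipRegion is not the leading key column,
-- so the same index has to be scanned from end to end.
-- 'Guangdong' is a hypothetical filter value.
SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(idx_cls_cover))
WHERE ShipRegion = 'Guangdong';
GO
```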

As a short aside: having created a covering index, let's compare its performance cost with that of a lookup through the default clustered index.

Performance cost comparison between the covering index and the default clustered index

USE TSQL2012
GO
SELECT OrderID, ShipAddress, ShipRegion
FROM Sales.Orders WITH (INDEX([pk_orders]))
WHERE OrderID < 10364
GO
SELECT OrderID, ShipAddress, ShipRegion
FROM Sales.Orders WITH (INDEX([Idx_noncls_cover_exceptorderid]))
WHERE OrderID < 10364
GO

The plans show that the covering index costs less than the default primary key clustered index. We can also look at the I/O cost.

Comparing the covering index with the default clustered index shows clearly that the covering index reduces I/O. The INCLUDE index compared next is another good solution.
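
One way to observe the I/O cost is SET STATISTICS IO, which reports the logical reads of each statement in the Messages output. The sketch below reuses the two queries from the comparison and assumes that both hinted indexes exist. No figures are shown here because they depend on the data.

```sql
SET STATISTICS IO ON;
GO
-- Clustered primary key index
SELECT OrderID, ShipAddress, ShipRegion
FROM Sales.Orders WITH (INDEX([pk_orders]))
WHERE OrderID < 10364;
GO
-- Covering nonclustered index
SELECT OrderID, ShipAddress, ShipRegion
FROM Sales.Orders WITH (INDEX([Idx_noncls_cover_exceptorderid]))
WHERE OrderID < 10364;
GO
SET STATISTICS IO OFF;
GO
```

Comparing the "logical reads" value reported for each query gives a direct measure of how many pages each index had to touch.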

Creating a nonclustered index with INCLUDE columns

USE TSQL2012
GO
CREATE NONCLUSTERED INDEX [ix_noncls_include]
ON [TSQL2012].[Sales].[Orders] (ShipCity)
INCLUDE (ShipAddress, ShipRegion, OrderID)

We have now removed the Bookmark Lookup, RID Lookup, and Key Lookup in two ways: with a covering index and with an INCLUDE index.

With two approaches available, which one performs better? Let's compare the cost of the two.

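Comparing the two ways of removing the Bookmark Lookup

SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(idx_cls_cover))
WHERE ShipCity = 'Shenzhen'
SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(ix_noncls_include))
WHERE ShipCity = 'Shenzhen'
GO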

The plans show no difference in cost between the two, although we lean toward the second approach, the INCLUDE index, as the solution. That is essentially the end, but one small question remains. We already created a clustered index on OrderID, and in both solutions we also added OrderID to the nonclustered index. Is that necessary? Let's remove it and try.

CREATE NONCLUSTERED INDEX Idx_noncls_cover_exceptorderid
ON Sales.Orders (ShipCity, ShipAddress, ShipRegion)
CREATE NONCLUSTERED INDEX Idx_noncls_include_exceptorderid
ON Sales.Orders (ShipCity) INCLUDE (ShipAddress, ShipRegion)

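With OrderID removed, compare the costs again:

SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(Idx_noncls_cover_exceptorderid))
WHERE ShipCity = 'Shenzhen'
SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(Idx_noncls_include_exceptorderid))
WHERE ShipCity = 'Shenzhen'
GO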

The comparison shows that the nonclustered indexes do not need to contain the column on which the clustered index was created. Why is that?

Conclusion: a nonclustered index never needs to list the clustered index column explicitly, because that column is already part of every nonclustered index. As long as a table has a clustered index, each nonclustered index on it carries the clustered index key along with its own columns.
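
The conclusion can be checked with a query that returns OrderID through an index that never names it. A minimal sketch, assuming the clustered index on OrderID and the index Idx_noncls_include_exceptorderid both exist as created earlier:

```sql
-- OrderID is not listed in Idx_noncls_include_exceptorderid,
-- but it is the clustered key, so the nonclustered index still carries it.
SELECT OrderID, ShipAddress, ShipCity, ShipRegion
FROM Sales.Orders WITH (INDEX(Idx_noncls_include_exceptorderid))
WHERE ShipCity = 'Shenzhen';
GO
```

If the conclusion holds, the actual execution plan should show an index seek on this index with no Key Lookup operator.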

Summary

In this section we worked from the problem to its solution in some detail in order to improve query performance. That's all for now; see you in the next section. Short content, in-depth understanding.
