SQL Server focuses on filtering indexes and SQL Server Indexes

Source: Internet
Author: User
Tags sql 2008

SQL Server focuses on filtering indexes and SQL Server Indexes

Preface

In this section, we will continue to talk about index knowledge. We have clustered indexes, non-clustered indexes, and covered indexes. Here we also have a filter index, which can improve query performance through index filtering, brief content and in-depth understanding.

Filter indexes and create non-clustered indexes on query conditions (1)

As a new feature of SQL 2008, index filtering is applied to some rows in the Table. Therefore, using the index filtering can increase the query speed, which reduces the cost of index maintenance and index storage compared with full table scanning. When we apply the WHERE condition on the index, the index is filtered. That is, the following format is met:

CREATE NONCLUSTERED INDEX <index name>ON <table> (<columns>)WHERE <criteria>;GO

Let's look at a simple query.

USE AdventureWorks2012GOSELECT SalesOrderDetailID, UnitPriceFROM Sales.SalesOrderDetailWHERE UnitPrice > 2000GO

No index is created in the above columns. Of course, except for the clustered index created by SalesOrderDetailID by default, in this case, we can assume that the query plan executed by SalesOrderDetailID must be a clustered index scan created by the primary key, as shown below:

We have already said that no index is created on the query condition at this time. Therefore, the clustered index created by the primary key is required at this time. Next, we will first create a non-clustered index on the UnitPrice column to improve the query performance,

CREATE NONCLUSTERED INDEX idx_SalesOrderDetail_UnitPriceON Sales.SalesOrderDetail(UnitPrice)

Now we will compare the query overhead of both.

USE AdventureWorks2012GODBCC FREEPROCCACHEDBCC DROPCLEANBUFFERSSELECT SalesOrderDetailID, UnitPriceFROM AdventureWorks2012.Sales.SalesOrderDetail WITH(INDEX([PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]))WHERE UnitPrice > 2000GOSELECT SalesOrderDetailID, UnitPriceFROM Sales.SalesOrderDetail WITH(INDEX([idx_SalesOrderDetail_UnitPrice]))WHERE UnitPrice > 2000

After a non-clustered index is set up on the query condition, the query overhead is significantly increased to more than 90%, because the non-clustered index also references the clustered index created by the primary key, this will not result in Bookmark Lookup or Key Lookup. Next, we will add a conditional non-clustered index to filter the index.

CREATE NONCLUSTERED INDEX idxwhere_SalesOrderDetail_UnitPriceON Sales.SalesOrderDetail(UnitPrice)WHERE UnitPrice > 1000

Now let's take a look at the performance overhead difference between the created index and the non-clustered index:

USE AdventureWorks2012GODBCC FREEPROCCACHEDBCC DROPCLEANBUFFERSSELECT SalesOrderDetailID, UnitPriceFROM AdventureWorks2012.Sales.SalesOrderDetail WITH(INDEX([idx_SalesOrderDetail_UnitPrice]))WHERE UnitPrice > 2000SELECT SalesOrderDetailID, UnitPriceFROM Sales.SalesOrderDetail WITH(INDEX([idxwhere_SalesOrderDetail_UnitPrice]))WHERE UnitPrice > 2000

At this point, we know that the non-clustered filter index we created is nearly halved compared with the traditional non-clustered index.

Unique filter Index

The unique filter index is also a good solution for all columns that must be unique and not empty (only one NULL is allowed). Therefore, when creating a unique filter index, the NULL value must be excluded, for example:

CREATE UNIQUE NONCLUSTERED INDEX uq_fix_Customers_EmailON Customers(Email)WHERE Email IS NOT NULLGO

Filter indexes and INCLUDE

When we add an additional column and use the clustered index created by the default primary key, the clustered index scan is performed, and then we create a filter index on the query condition, when we use this filter index forcibly, adding additional columns will cause the data to be returned to the base table before obtaining the data. This causes the Key Lookup to look up, as shown below:

USE AdventureWorks2012GOSELECT SalesOrderDetailID, UnitPrice, UnitPriceDiscountFROM Sales.SalesOrderDetailWHERE UnitPrice > 2000GO

In this case, we need to use INCLUDE to INCLUDE additional columns.

CREATE NONCLUSTERED INDEX [idx_SalesOrderDetail_UnitPrice] ON Sales.SalesOrderDetail(UnitPrice) INCLUDE(UnitPriceDiscount)

Create a filter index and include additional columns.

CREATE NONCLUSTERED INDEX [idxwhere_SalesOrderDetail_UnitPrice] ON Sales.SalesOrderDetail(UnitPrice) INCLUDE(UnitPriceDiscount)WHERE UnitPrice > 2000

Next, compare the performance query differences between the added and excluded filter indexes.

SELECT SalesOrderDetailID, UnitPrice, UnitPriceDiscountFROM AdventureWorks2012.Sales.SalesOrderDetail WITH(INDEX([idx_SalesOrderDetail_UnitPrice]))WHERE UnitPrice > 2000 SELECT SalesOrderDetailID, UnitPrice, UnitPriceDiscountFROM Sales.SalesOrderDetail WITH(INDEX([idxwhere_SalesOrderDetail_UnitPrice]))WHERE UnitPrice > 2000

In this case, the performance of INCLUDE to INCLUDE extra columns is also improved.

Filter indexes and create non-clustered indexes on the primary key (2)

In the first case column, we can directly create a non-clustered index on the query column because it is of the numeric type. What if the query condition is of the character type? First, create a test table.

USE TSQL2012GOCREATE TABLE dbo.TestData (  RowID    integer IDENTITY NOT NULL,   SomeValue  VARCHAR(max) NOT NULL,     StartDate  date NOT NULL,  CONSTRAINT PK_Data_RowID    PRIMARY KEY CLUSTERED (RowID));

Add 0.1 million pieces of test data

USE TSQL2012GOINSERT dbo.TestData WITH (TABLOCKX)  (SomeValue, StartDate)SELECT  CAST(N.n AS VARCHAR(max)) + 'JeffckyWang',  DATEADD(DAY, (N.n - 1) % 31, '20140101')FROM dbo.Nums AS NWHERE   N.n >= 1   AND N.n < 100001;

If we need to obtain SomeValue = 'jeffckywang' in the TestData table, we want to create a non-clustered index on SomeValue and then filter it, as shown below:

USE TSQL2012GOCREATE NONCLUSTERED INDEX idx_noncls_somevalueON dbo.TestData(SomeValue)WHERE SomeValue = 'JeffckyWang'

Update

SQL Server has a limit on the size of indexes to be created. The maximum value is 900 bytes. The preceding VARCHAR (MAX) is written directly, so errors may occur. Remember, remember.

In this case, we create a non-clustered index on the primary key. We create a filter index on the primary key RowID and SomeValue = 'jeffckywang', and then return the data as follows:

CREATE NONCLUSTERED INDEX idxwhere_noncls_somevalueON dbo.TestData(RowID)WHERE SomeValue = 'JeffckyWang'

Next we will compare the results of the query plan before and after the filtering index is created:

USE TSQL2012GOSELECT RowID, SomeValue, StartDate FROM dbo.TestData WITH(INDEX([idx_pk_rowid]))WHERE SomeValue = 'JeffckyWang'SELECT RowID, SomeValue, StartDate FROM dbo.TestData WITH(INDEX([idxwhere_noncls_somevalue]))WHERE SomeValue = 'JeffckyWang'

Then, remove the Key Lookup and INCLUDE the created filter indexes.

CREATE NONCLUSTERED INDEX [idxwhere_noncls_somevalue] ON dbo.TestData(RowID) INCLUDE(SomeValue,StartDate) WHERE SomeValue = 'JeffckyWang'

From this we can see that whether it is to create a filter index for the query condition or to create a filter index for the primary key, we can improve the query performance by combining what we have learned before.

We have been talking about creating a filter index since the beginning. What are the conditions for creating a filter index?

(1) only non-clustered indexes can be created.

(2) If you create a filter index on the view, the view must be a persistent view.

(3) You cannot create a filter index on the full-text index.

Advantages of index Filtering

(1) Reduce index maintenance costs: adding, deleting, modifying, and other operations are not costly, because rebuilding a filter index does not take much time.

(2) reduce storage costs: the storage space used for filtering indexes is small.

(3) More accurate statistics: creating a filter index on the WHERE condition is more accurate than the full table statistical results.

(4) Optimize Query performance: the query plan shows its efficiency.

So far, I have always stated the benefits and advantages of filtering indexes, and I have already put them on the list. In fact, its shortcomings are also obvious.

Disadvantages of filtering Indexes

The biggest drawback is the query condition restriction. Its query conditions are limited

<filter_predicate> ::=    <conjunct> [ AND <conjunct> ] <conjunct> ::=   <disjunct> | <comparison>  <disjunct> ::=     column_name IN (constant ,...n)

The filter conditions are limited to AND, |, and in. The comparison condition IS limited to {IS | is not |=|<>|! = | >|>=|!> | <= |! <}, So we cannot use LIKE as follows.

CREATE NONCLUSTERED INDEX [idxwhere_noncls_somevalue] ON dbo.TestData(RowID) INCLUDE(SomeValue,StartDate) WHERE SomeValue LIKE 'JeffckyWang%'

You can

USE AdventureWorks2012GOCREATE NONCLUSTERED INDEX idx_SalesOrderDetail_ModifiedDateON Sales.SalesOrderDetail(ModifiedDate)WHERE ModifiedDate >= '2008-01-01' AND ModifiedDate <= '2008-01-07'GO

But not as follows:

CREATE NONCLUSTERED INDEX idx_SalesOrderDetail_ModifiedDateON Sales.SalesOrderDetail(ModifiedDate)WHERE ModifiedDate = GETDATE()GO

Influence of variables on index Filtering

In the preceding example, we create a filter index that is directly defined on the query condition, as follows:

CREATE NONCLUSTERED INDEX idxwhere_SalesOrderDetail_UnitPriceON Sales.SalesOrderDetail(UnitPrice)WHERE UnitPrice > 1000

What if we define a variable and compare it with it? First, create a filter index.

CREATE NONCLUSTERED INDEX idx_SalesOrderDetail_ProductID ON Sales.SalesOrderDetail (ProductID)WHERE ProductID = 870

Use variables to compare with query conditions and forcibly use the filter index (clustered index is used by default)

USE AdventureWorks2012GODECLARE @ProductID INT SET @ProductID = 870 SELECT ProductID FROM Sales.SalesOrderDetail WITH(INDEX([idx_SalesOrderDetail_ProductID]))WHERE ProductID = @ProductID

An error occurred while viewing the query execution plan results. Now we need to add OPTION to recompile, as shown below:

USE AdventureWorks2012GODECLARE @ProductID INT SET @ProductID = 870 SELECT ProductID FROM Sales.SalesOrderDetailWHERE ProductID = @ProductIDOPTION(RECOMPILE)

The preceding variables are used to query and re-compile SQL Server 2012 using OPTION. For details about other versions, see [The Pains of Filtered Indexes ].

Summary

This section describes how to improve query performance by filtering indexes. It also provides different scenarios and advantages and obvious disadvantages. For a brief introduction and a deep understanding, we will try again later in the next section, good night.

The above is all the content of this article. I hope the content of this article will help you in your study or work. If you have any questions, you can leave a message and share it with us!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.