SQLServer Performance Optimization-indirect implementation of function indexes or Hash indexes, sqlserverhash

Last Update:2017-04-03 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

SQL Server does not have function indexes. In some scenarios, You need to perform a query based on a certain part of the field or perform a query after some calculation, if a function or other method is used on a field, it will limit the use of the index. However, we can indirectly implement functions similar to function indexes.

The other one is that if the query field is large or there are many fields, the index is a little bulky and inefficient, you need to consider using a small "alternative" field for equivalent replacement, similar to the Hash index,

This article briefly introduces two solutions to the above two problems, for your reference only.

1. Create an index on the computing column to implement the function index function.

SQL Server allows you to use a computed column when creating a table. You can use this computed column to implement the function index function. Here is an example.

Create Table TestFunctionIndex (id int identity (1, 1), val varchar (50), subval as LOWER (SUBSTRING (val, 10, 4) persisted -- add a persistent computing column) GO -- create index idx_subvar on TestFunctionIndex (subval) GO -- insert 10 million rows of test data insert into TestFunctionIndex (val) values (NEWID () go 100000

Indexes cannot be used after functions are used on indexed fields.

If you directly query the computed column, you can use the index normally.

By creating an index on the calculated column, you can search for the index on the calculated column to avoid directly using functions or other operations on the field, this results in the failure to use indexes even on fields.

Supplement:

During the test, it was discovered that if an index is created on the calculated column field, the index on the calculated column can be magically used when the function on the original field is the same as the function on the calculated column. It can be seen that SQLServer has made a lot of effort in areas we didn't pay attention.

2. Generate the Hash values of long fields or multiple fields to replace the original fields for query or connection to improve the query efficiency.

Another common problem encountered during development is that the query condition fields that are frequently used are long, or there are many connection condition fields when the table is connected,

Even if there is an index on a field or query condition, the query efficiency may be affected because the field is long or has many conditions.

In this case, we should take appropriate consideration to generate a smaller field (but ensure uniqueness) for the original long field, or to generate a shorter data type for multiple fields, to improve query efficiency

For example, if there is such a table, the Name field is simulated by me, and the Name field is a long field, which must be used for retrieval.

This means that the query field is long and the index cost is too high. In this case, you need to consider replacing it with a small equivalent field.

The following code calculates the Hash value of a long field in a certain way for equivalent replacement.

Generate Test Data

Create table testHashColumn (id int identity (100), QueryName nvarchar (), HashName as cast (HASHBYTES ('md2', QueryName) as uniqueidentifier) persisted) GOcreate index idx_HashName ON testHashColumn (HashName) GO -- a long name field DECLARE @ I int = 0 while @ I <10000 begin INSERT INTO testHashColumn (QueryName) is generated here) VALUES (CONCAT ('Beijing New Viewpoint Technology and Culture Media Co., Ltd. ', @ I) set @ I = @ I + 1end

We know that the Name of Name is nvarchar (100). It is not impossible to index this field. If the situation is complicated, it may be larger than this field, and the index is too wide, the index space is too large, which may affect the efficiency to a certain extent.

Here we can consider generating an "alternative" field on the Name field (the above HashName as cast (HASHBYTES ('md2', QueryName) as uniqueidentifier) persisted ),

The first choice for this field is to correspond to the actual values one by one. In addition, the field type requirements for "substitution" are relatively small. Of course, there are also many methods, for example, the checksum function is used to generate a check value. However, it is observed that the checksum generated by the checksum function may be duplicated. That is to say, two different strings generate the same check value.

For example, it is easy to verify this problem. It can be considered that for different strings, the same checksum is obtained after calculation.

Therefore, when generating an "alternative" field, you must consider the uniqueness of the calculated value.

The HASHBYTES encryption function is used to encrypt the string and generate a UNIQUEIDENTIFIER for the encrypted data. The probability of duplication is much smaller.

Demonstrate how to use CAST (HASHBYTES ('md2', 'Beijing New Viewpoint technology culture media Co., Ltd. 999 ') as uniqueidentifier, you can generate a UNIQUEIDENTIFIER type field for this long field,

Of course, this method may not be the only method, or even complicated. As long as it can ensure that a unique long field generates a shorter field, it can be unique to achieve the goal.

By referring to the following query, you can use the value calculated by HashName to compare it with the calculated column. To a certain extent, you can reduce the size of the field index and achieve the goal.

For example, you can use the index on the HashName field, and avoid creating an index on the long field of the original QueryName, which saves space and improves query efficiency.

3. When the logical primary key is multiple fields, an alternative unique field is generated on the extra fields.

In some cases, business needs or designs (for example, the third paradigm, the BC paradigm, the fourth paradigm, or even the fifth paradigm) usually have multiple fields during table join.

For example:

SELECT *FROM TableNameA aINNER JOIN TableNameB b  ON a.key=b.key    AND a.Type = b.Type    AND a.Status = b.Staus    AND a.CreationTime = b.CreationTime    AND a.***=b.***where ***

When a table is associated, there are many join conditions. In this case, the best case is to create a wide composite index. However, in this case, the width and size of the index become very large, the efficiency also has a certain impact. In this case, you can consider using multiple connected fields (Key + Type + Status + CreationTime + ***) on TableNameA and TableNameB to create a computing column similar to the one in example 2, create an index on the computing column, and then replace it with the following method when you connect to the table.

SELECT *FROM TableNameA aINNER JOIN TableNameB b  ON a.HashValue=b.HashValueWHERE ***

Always, this is an idea of changing the time of space (storing a field similar to an identifier in redundancy to improve the query efficiency). There are two ideas for generating "alternative" fields, first, it must be small enough. Second, it must generate the original value to replace the uniqueness of the field.

Summary:SQL Server does not have function indexes or Hash indexes. Some business requirements or performance considerations require similar functions, which can be achieved through a space-to-time approach, it can be changed to implement functions similar to function indexes or Hash indexes, which have achieved the effects of function indexes and Hash indexes in other databases (although the principles may be different ). Note that you must pay attention to the calculation method when generating a calculated column or replacing the Hash value to ensure the uniqueness of the generated Key value. Of course, the implementation method can be selected as needed, and all the major roads will go to Rome.

The above is all the content of this article. I hope this article will help you in your study or work. I also hope to provide more support to the customer's home!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

SQLServer Performance Optimization-indirect implementation of function indexes or Hash indexes, sqlserverhash

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

SQLServer Performance Optimization-indirect implementation of function indexes or Hash indexes, sqlserverhash

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support