Indirectly implement function index or hash index in SQL Server

Last Update:2017-03-26 Source: Internet

Author: User


The source of this article: http://www.cnblogs.com/wy123/p/6617700.html

SQL Server has no function-based index. In some scenarios a query filters on part of a column, or on the result of a calculation over it,
and wrapping the column in a function or expression inside the predicate generally prevents an index on that column from being used for a seek. An effect similar to a function index can still be achieved indirectly.
A second problem is that when the queried column is long, or the query involves many columns, an index on them is wide and less efficient.
In that case a smaller "substitute" column can stand in for the original, much like a hash index.
This article briefly introduces one solution to each of these two problems, for reference only.

1. Use an index on a computed column to get the effect of a "function index"

SQL Server allows computed columns in a table definition, and an index on a computed column can play the role of a function index, as illustrated here:

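CREATE TABLE TestFunctionIndex
(
    Id INT IDENTITY(1,1),
    Val VARCHAR(50),  -- length is garbled in the source; 50 is assumed (a NEWID() string needs 36 characters)
    SubVal AS LOWER(SUBSTRING(Val, 10, 4)) PERSISTED  -- add a persisted computed column
)
GO

-- Create an index on the persisted computed column
CREATE INDEX Idx_subvar ON TestFunctionIndex (SubVal)
GO

-- Insert 100,000 rows of test data (GO <count> repeats the batch in SSMS/sqlcmd)
INSERT INTO TestFunctionIndex (Val) VALUES (NEWID())
GO 100000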

When a function is applied to an indexed column in the predicate, the index on that column generally cannot be used for a seek.

If you query the computed column directly, the index on it is used as normal.

Creating an index on a computed column lets queries search through that index instead of applying a function or other operation directly to the original column, which would leave the query without a usable index.
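As a minimal sketch of the direct query on the computed column, assuming the TestFunctionIndex table and the index on SubVal from the script above already exist. The literal 'a1b2' is only an illustrative value, and SET STATISTICS IO is used here simply as one way to see how many pages the query reads.

```sql
SET STATISTICS IO ON;

-- Filter on the computed column itself, so the index on SubVal
-- is available to the optimizer for a seek.
SELECT Id, Val, SubVal
FROM TestFunctionIndex
WHERE SubVal = 'a1b2';  -- illustrative value: four lowercase hex characters

SET STATISTICS IO OFF;
```

Viewing the actual execution plan for this statement is the more direct way to confirm whether an index seek was chosen.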

Addendum:
Testing shows that once an index exists on the computed column, a query that applies the same expression to the original column can also use that index, provided the expression matches the computed column's definition.
SQL Server's optimizer does a lot of work in places we rarely notice.
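A query of the kind this addendum describes might look like the sketch below. It assumes the TestFunctionIndex table defined earlier; 'a1b2' is an illustrative value. Whether the optimizer actually matches the expression to the computed column depends on the expression being the same as the column definition and on the session SET options required for indexed computed columns, so the plan should be checked rather than assumed.

```sql
-- The predicate repeats the expression used to define SubVal,
-- LOWER(SUBSTRING(Val, 10, 4)), but is written against Val.
SELECT Id, Val
FROM TestFunctionIndex
WHERE LOWER(SUBSTRING(Val, 10, 4)) = 'a1b2';  -- illustrative value
```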

2. Generate a hash of a long column, or of several columns, and use it in place of the originals for lookups or joins

Another situation that comes up often in development is a query condition on a long column, or a join between tables on more than one column.
Even when the filter or join columns are indexed, query efficiency can suffer because the columns are long or the conditions are numerous.
In this case, consider deriving a smaller column from the long one (while preserving uniqueness), or a single short value from several columns, to make the query more efficient.

For example, take a table like the one below. The QueryName values are simulated; QueryName is a long column that is used for lookups.
The lookup column is long and indexing it is expensive, so a smaller equivalent is worth considering.
The following computes a hash of the long column to serve as that substitute.

Generate simulated test data:

CREATE TABLE TestHashColumn
(
    Id INT IDENTITY(1,1),
    QueryName NVARCHAR(100),
    -- 16-byte MD2 hash of QueryName, stored as a uniqueidentifier
    HashName AS CAST(HASHBYTES('MD2', QueryName) AS UNIQUEIDENTIFIER) PERSISTED
)
GO

CREATE INDEX Idx_hashname ON TestHashColumn (HashName)
GO

-- Simulate a longer name column
DECLARE @i INT = 0
WHILE @i < 10000
BEGIN
    INSERT INTO TestHashColumn (QueryName)
    VALUES (CONCAT('Beijing New Perspective Science and Culture Media Co., Ltd.', @i))
    SET @i = @i + 1
END

QueryName is nvarchar(100), so the column could simply be indexed directly.
In more complicated real cases, though, the column may be much longer. The index then becomes too wide and takes up a lot of space, and efficiency suffers to some degree.
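One way to see the space difference is to compare index sizes. This is only a sketch, assuming the TestHashColumn table above has been created and populated; Idx_queryname is a hypothetical index created purely for the comparison and dropped afterwards.

```sql
-- Hypothetical wide index on the long column, for comparison only.
CREATE INDEX Idx_queryname ON TestHashColumn (QueryName);
GO

-- Pages used per index on the table (a page is 8 KB).
-- The row with a NULL index_name is the heap itself.
SELECT i.name AS index_name,
       SUM(ps.used_page_count) * 8 AS used_kb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id
WHERE ps.object_id = OBJECT_ID('TestHashColumn')
GROUP BY i.name;
GO

DROP INDEX Idx_queryname ON TestHashColumn;
GO
```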

Here you can consider generating a "substitute" column from QueryName (the computed column above: HashName AS CAST(HASHBYTES('MD2', QueryName) AS UNIQUEIDENTIFIER) PERSISTED).

Ideally the substitute maps one-to-one to the actual value, and its data type should be comparatively small.
There are many ways to produce one, for example computing a checksum with the CHECKSUM function.
In practice, however, CHECKSUM values are quite likely to repeat: two different strings can produce the same checksum value.
This is easy to verify: different strings can be found that yield the same checksum after the calculation.
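A simple way to look for such duplicates in your own data is sketched below, assuming the TestHashColumn table from the test script. It returns one row per checksum value shared by more than one distinct QueryName; it may well return nothing for a small or regular data set.

```sql
-- Checksum values produced by at least two different QueryName values.
SELECT CHECKSUM(QueryName) AS checksum_value,
       COUNT(DISTINCT QueryName) AS distinct_names
FROM TestHashColumn
GROUP BY CHECKSUM(QueryName)
HAVING COUNT(DISTINCT QueryName) > 1;
```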

So the uniqueness of the computed values must be considered when generating the "substitute" column.
Here the HASHBYTES function (a hash function, not encryption) hashes the string, and the 16-byte result is cast to a uniqueidentifier, so the probability of a repeat is much smaller.
For example, CAST(HASHBYTES('MD2', N'Beijing New Perspective Science and Culture Media Co., Ltd.999') AS UNIQUEIDENTIFIER) turns this longer value into a uniqueidentifier (the literal is nvarchar to match QueryName, because HASHBYTES gives different results for varchar and nvarchar input).
This is not the only way, and the computation can be more elaborate; any method works as long as distinct long values produce distinct short values.
In a query, compute the hash of the search value and compare it with the HashName computed column. This shrinks the index needed for the search column while still achieving the goal.

This way the query can use the index on HashName, and no index is needed on the long original QueryName column, which saves space and improves query efficiency.
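The lookup described here could be written roughly as follows. This is a sketch that assumes the TestHashColumn table and its HashName index from the test script; the variable name @name and the extra equality check on QueryName are additions for illustration, the latter as a guard in case two different names ever hash to the same value.

```sql
-- The variable is nvarchar so the hash matches the one computed
-- from the nvarchar QueryName column.
DECLARE @name NVARCHAR(100) =
    N'Beijing New Perspective Science and Culture Media Co., Ltd.999';

SELECT Id, QueryName
FROM TestHashColumn
WHERE HashName = CAST(HASHBYTES('MD2', @name) AS UNIQUEIDENTIFIER)  -- seek on the narrow index
  AND QueryName = @name;                                            -- re-check the real value
```

The hash algorithm in the query has to be the same one used in the computed column definition, otherwise no rows will match.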

3. When the logical primary key spans multiple columns, generate a single unique "substitute" column from them

In some cases, because of business requirements or the design (for example, a schema that does not reach third normal form, BCNF, fourth normal form, or even fifth normal form), a join between tables involves multiple columns.
It looks like this:

SELECT *
FROM TableNameA a
INNER JOIN TableNameB b
    ON a.[Key] = b.[Key]
    AND a.type = b.type
    AND a.status = b.status
    AND a.creationtime = b.creationtime
    AND a.*** = b.***
WHERE ***

When the tables are joined this way there are many join conditions.
The usual answer is to build a wide composite index,
but the index then grows in width and size, which costs some efficiency in use.
Instead, on both TableNameA and TableNameB,
combine the join columns (Key + type + status + creationtime + ***) into a computed column like the one in Example 2 and index it.
The join can then be rewritten as follows:

SELECT *
FROM TableNameA a
INNER JOIN TableNameB b
    ON a.hashvalue = b.hashvalue
WHERE ***
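A possible shape for such a hash column is sketched below. The tables DemoA and DemoB, their column types, and the '|' separator are all assumptions made for illustration; the article does not give the real definitions. Explicit CONVERT calls (with style 121 for the datetime) are used so the expression is deterministic and can be persisted, and the separator keeps values such as (1, 23) and (12, 3) from concatenating to the same string.

```sql
-- Hypothetical table: the join columns plus a persisted hash of them.
CREATE TABLE DemoA
(
    [Key]        INT NOT NULL,
    [Type]       INT NOT NULL,
    [Status]     INT NOT NULL,
    CreationTime DATETIME NOT NULL,
    HashValue AS CAST(HASHBYTES('MD2',
          CONVERT(VARCHAR(11), [Key])    + '|'
        + CONVERT(VARCHAR(11), [Type])   + '|'
        + CONVERT(VARCHAR(11), [Status]) + '|'
        + CONVERT(VARCHAR(23), CreationTime, 121)) AS UNIQUEIDENTIFIER) PERSISTED
);
GO

-- Second hypothetical table with the identical hash expression.
CREATE TABLE DemoB
(
    [Key]        INT NOT NULL,
    [Type]       INT NOT NULL,
    [Status]     INT NOT NULL,
    CreationTime DATETIME NOT NULL,
    HashValue AS CAST(HASHBYTES('MD2',
          CONVERT(VARCHAR(11), [Key])    + '|'
        + CONVERT(VARCHAR(11), [Type])   + '|'
        + CONVERT(VARCHAR(11), [Status]) + '|'
        + CONVERT(VARCHAR(23), CreationTime, 121)) AS UNIQUEIDENTIFIER) PERSISTED
);
GO

CREATE INDEX Idx_DemoA_HashValue ON DemoA (HashValue);
CREATE INDEX Idx_DemoB_HashValue ON DemoB (HashValue);
GO

-- Join on the single narrow column instead of four separate conditions.
SELECT a.[Key], a.[Type], a.[Status], a.CreationTime
FROM DemoA AS a
INNER JOIN DemoB AS b
    ON a.HashValue = b.HashValue;
```

Both tables must build the hash from the same columns, in the same order, with the same conversions; any difference produces hash values that never match.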

In short, this trades space for time (an identifier-like column is stored redundantly to speed up queries).
Generating a "substitute" column comes down to two points: first, it must be small enough; second, distinct original values must map to distinct substitute values.

Summary: SQL Server has no function index or hash index, yet some business requirements or performance considerations call for similar functionality.
By trading space for time, it is possible to get an effect similar to the function indexes and hash indexes of other databases (although the underlying mechanism may differ).
When generating the computed column or the hash substitute, pay attention to the calculation method so that the generated key values stay unique.
The exact implementation can be chosen to suit the need; all roads lead to Rome.


