[Translation] Stairway to SQL Server Indexes, Level 8: Unique Indexes

Source: Internet
Author: User

Index design is an important part of database design and is crucial to database performance. But designing indexes is not easy, and good performance does not come for free: many engineers create inappropriate indexes and end up with the opposite of the intended effect. You could say that indexes can make a database, and indexes can break it.

This series of articles is translated and adapted from Stairway to SQL Server Indexes and posted on Agilesharp and Blog Park, in the hope of helping fellow developers learn how to use indexes.

Unique Indexes and constraints

Unique indexes are not fundamentally different from other indexes, except that they do not allow duplicate values in the index key. Because each entry in the index corresponds to a row in the table, rejecting a duplicate index entry also rejects the corresponding row from the table. This is why unique indexes are used to enforce primary keys and candidate keys.

When you declare a PRIMARY KEY or UNIQUE constraint on a table, SQL Server automatically creates a corresponding unique index with the same name as the constraint. You can create a unique index without a unique constraint, but not the other way around. You cannot drop that index directly while the constraint exists. You can, however, drop the constraint, and doing so also drops the index associated with it.
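The relationship between a constraint and its index can be checked with a small experiment. The sketch below uses a hypothetical table dbo.ConstraintDemo and a hypothetical constraint name UQ_ConstraintDemo_Code (neither exists in AdventureWorks) and looks up the automatically created index in the sys.indexes catalog view.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE dbo.ConstraintDemo
(
    Id   INT         NOT NULL,
    Code VARCHAR(10) NOT NULL,
    CONSTRAINT UQ_ConstraintDemo_Code UNIQUE (Code)
);

-- The index created for the constraint is listed under the constraint's name.
SELECT name, type_desc, is_unique, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.ConstraintDemo');

-- Dropping the index directly is expected to fail while the constraint exists:
-- DROP INDEX UQ_ConstraintDemo_Code ON dbo.ConstraintDemo;

-- Dropping the constraint removes the index along with it.
ALTER TABLE dbo.ConstraintDemo DROP CONSTRAINT UQ_ConstraintDemo_Code;

-- Clean up the demo table.
DROP TABLE dbo.ConstraintDemo;
```

Running the SELECT again after the ALTER TABLE statement (and before the DROP TABLE) should show that the index is gone.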

Each table can contain multiple unique indexes. For example, the Product table in AdventureWorks has four unique indexes, on the ProductID, ProductNumber, rowguid, and Name columns. The designers of the Product table chose ProductID as the primary key and left the other three as candidate keys.

You can create a unique index with a CREATE INDEX statement, for example:

CREATE UNIQUE NONCLUSTERED INDEX [AK_Product_Name] ON Production.Product ([Name]);

You can also create a unique index by directly defining the constraint:

ALTER TABLE Production.Product ADD CONSTRAINT PK_Product_ProductID PRIMARY KEY CLUSTERED (ProductID);

With the first statement, the Product table cannot contain two rows with the same product name; with the second, it cannot contain two rows with the same ProductID.

Because defining a primary key or a unique constraint causes an index to be created, you must supply any necessary index information when you define the constraint. That is why the ALTER TABLE statement above contains the CLUSTERED keyword.

If the column constrained by a unique index or constraint already contains duplicate values in the current table, creating the index will fail.
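Before creating a unique index on an existing table, it can help to check for duplicates first. The following is a minimal sketch against the Name column of Production.Product; the DuplicateCount alias is just an illustrative name.

```sql
-- List every product name that occurs more than once.
-- If this returns any rows, creating a unique index on [Name] would fail.
SELECT [Name], COUNT(*) AS DuplicateCount
FROM Production.Product
GROUP BY [Name]
HAVING COUNT(*) > 1;
```

An empty result means the column currently satisfies the uniqueness requirement.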

Once the unique index has been created, every DML statement that would violate it fails. For example, suppose we try to insert a row whose product name already exists in the table, as in the following statement:

INSERT Production.Product
(Name,
ProductNumber,
Color,
SafetyStockLevel,
ReorderPoint,
StandardCost,
ListPrice,
Size,
SizeUnitMeasureCode,
WeightUnitMeasureCode,
[Weight],
DaysToManufacture,
ProductLine,
Class,
Style,
ProductSubcategoryID,
ProductModelID,
SellStartDate,
SellEndDate,
DiscontinuedDate
)
VALUES ('Full-Finger Gloves, M',
'A unique product number',
'Black',
4,
3,
20.00,
40.00,
'M',
NULL,
NULL,
NULL,
0,
'M',
NULL,
'U',
20,
3,
GETDATE(),
GETDATE(),
NULL
);


Code 1: Inserting a row whose product name already exists in the table

Executing the code above fails with an error message. The message tells us that the AK_Product_Name index does not allow us to insert a row whose product name duplicates one already in the table.

Primary key, UNIQUE constraint, and no constraint

PRIMARY KEY constraints and UNIQUE constraints differ in a few small ways.

    • A PRIMARY KEY constraint does not allow NULL values: none of the key columns of the primary key's index may contain NULL. A UNIQUE constraint does allow NULL, but it treats two NULLs as duplicates, so a column with a UNIQUE constraint can contain NULL in only one row.
    • A clustered index is created automatically when the primary key is created, unless the table already has a clustered index or the NONCLUSTERED keyword is specified when the primary key is created.
    • A nonclustered index is created automatically when a unique constraint is created, unless you specify the CLUSTERED keyword and the table does not already have a clustered index.
    • Each table can have only one primary key, but it can have more than one unique constraint.
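The differences listed above can be observed on a small throwaway table. This is a sketch only; dbo.KeyDemo and its constraint names are hypothetical and are not part of AdventureWorks.

```sql
-- Hypothetical demo table with one PRIMARY KEY and one UNIQUE constraint.
CREATE TABLE dbo.KeyDemo
(
    Id      INT         NOT NULL CONSTRAINT PK_KeyDemo PRIMARY KEY,      -- clustered by default
    AltCode VARCHAR(10) NULL     CONSTRAINT UQ_KeyDemo_AltCode UNIQUE    -- nonclustered by default
);

-- Show which index each constraint produced.
SELECT name, type_desc, is_primary_key, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.KeyDemo');

-- The first NULL in the UNIQUE column is accepted.
INSERT INTO dbo.KeyDemo (Id, AltCode) VALUES (1, NULL);

-- A second NULL is treated as a duplicate, so this statement is expected to fail.
INSERT INTO dbo.KeyDemo (Id, AltCode) VALUES (2, NULL);

-- Clean up the demo table.
DROP TABLE dbo.KeyDemo;
```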

For guidance on choosing between a unique constraint and a unique index, see MSDN, which says the following:

There is no significant difference between a unique constraint and a unique index. Data is validated in the same way whether you create a standalone unique index or use a unique constraint, and the query optimizer does not distinguish between a unique index created by a constraint and one created manually. However, when data integrity is the goal, it is better to create a constraint, because doing so makes the purpose of the corresponding index clear at a glance.

Mixing unique and filtered indexes

We mentioned above that a unique index allows only one NULL value, but this often conflicts with common business requirements. Frequently we want to forbid duplicates among the values that are present while still allowing multiple rows that have no value.

For example, suppose you are a supplier and all of your products come from third-party vendors. You keep all of your product information in a table called ProductDemo. You have your own ProductID, and you also track each product's UPC (Universal Product Code) value. But not every vendor's products have a UPC, so part of the data in your table looks like this:

ProductID        UPCode            Other Columns
(primary key)    (unique index)
14aj-w           036000291452
23cz-m           NULL
23cz-l           NULL
18mm-j           044000865867

Table 1. Part of the ProductDemo table

For the second column of the table above, you have to guarantee that UPCode values are unique while still allowing NULL values. The best way to achieve this is to combine a unique index with a filtered index (filtered indexes were introduced in SQL Server 2008).

As a demonstration, we create the table shown in Table 1.







Next we insert the data as shown above.








When we then try to insert a row with a duplicate UPCode value, we receive an error.





(Translator's note: the original author seems to have slipped here. Because he never created the filtered unique index, following the original text would not raise an error. I have added it here. Note that the filter must be written with IS NOT NULL, because a comparison such as != NULL never evaluates to true. Code: CREATE UNIQUE NONCLUSTERED INDEX xx ON ProductDemo (UPCode) WHERE UPCode IS NOT NULL)
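The whole demonstration can be sketched as follows. This is a reconstruction under assumptions, not the original author's code: the column types, the dbo schema, the index name UX_ProductDemo_UPCode, and the extra product ID '14aj-k' used for the duplicate are all hypothetical.

```sql
-- Assumed table definition; the column types are a guess based on Table 1.
CREATE TABLE dbo.ProductDemo
(
    ProductID NCHAR(6)  NOT NULL CONSTRAINT PK_ProductDemo PRIMARY KEY,
    UPCode    NCHAR(12) NULL
);

-- Unique AND filtered: uniqueness is enforced only for rows where UPCode is present.
CREATE UNIQUE NONCLUSTERED INDEX UX_ProductDemo_UPCode
ON dbo.ProductDemo (UPCode)
WHERE UPCode IS NOT NULL;

-- The rows from Table 1; two of them have no UPCode, which the filter permits.
INSERT INTO dbo.ProductDemo (ProductID, UPCode)
VALUES (N'14aj-w', N'036000291452'),
       (N'23cz-m', NULL),
       (N'23cz-l', NULL),
       (N'18mm-j', N'044000865867');

-- A new product that reuses an existing UPCode: this statement is expected to fail.
INSERT INTO dbo.ProductDemo (ProductID, UPCode)
VALUES (N'14aj-k', N'036000291452');
```

Without the WHERE clause, the second NULL row in the first INSERT would already be rejected as a duplicate.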

Choosing the appropriate IGNORE_DUP_KEY option

When you create a unique index, you can specify the IGNORE_DUP_KEY option, so the statement that created a unique index at the beginning of this article could also be written as:
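As a sketch, the CREATE INDEX statement from the start of the article with the option spelled out explicitly might look like this. OFF is the default, so the statement behaves the same as the version without a WITH clause.

```sql
-- Same index as before, with the IGNORE_DUP_KEY option stated explicitly.
-- If AK_Product_Name already exists, add DROP_EXISTING = ON to the WITH clause
-- (or drop the index first) before running this.
CREATE UNIQUE NONCLUSTERED INDEX AK_Product_Name
ON Production.Product ([Name])
WITH (IGNORE_DUP_KEY = OFF);
```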


The name IGNORE_DUP_KEY is easy to misunderstand. Duplicate values are never ignored when a unique index exists. More precisely, duplicate keys are never allowed into a unique index. The option only comes into play when a statement inserts multiple rows.

For example, suppose you have two tables, TableA and TableB, with exactly the same structure. You might submit the following statement to SQL Server.

INSERT INTO TableA SELECT * FROM TableB;

SQL Server attempts to insert all of the rows from TableB into TableA. But what if a unique index rejects some of those rows because they duplicate rows already in TableA? Do you want only the duplicate rows to fail, or the entire INSERT statement to fail?

That depends on the IGNORE_DUP_KEY setting. When you create a unique index, you can use this option to specify what happens when an insert fails. Its two settings are interpreted as follows:

IGNORE_DUP_KEY = OFF

The entire INSERT statement fails and an error message is returned. This is the default setting.

IGNORE_DUP_KEY = ON

Only the rows that have duplicate keys fail; all other rows are inserted, and a warning message is returned.

The IGNORE_DUP_KEY option affects only INSERT statements. It has no effect on UPDATE, CREATE INDEX, or ALTER INDEX statements. The option can also be set when defining a primary key or a unique constraint.
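The two behaviors can be compared on a pair of throwaway tables. This is a minimal sketch: the single Id column, the dbo schema, and the index name UX_TableA_Id are assumptions made for illustration.

```sql
-- Two hypothetical tables with the same structure.
CREATE TABLE dbo.TableA (Id INT NOT NULL);
CREATE TABLE dbo.TableB (Id INT NOT NULL);

-- Unique index on the target table that discards duplicate rows instead of failing.
CREATE UNIQUE NONCLUSTERED INDEX UX_TableA_Id
ON dbo.TableA (Id)
WITH (IGNORE_DUP_KEY = ON);

INSERT INTO dbo.TableA (Id) VALUES (1), (2);
INSERT INTO dbo.TableB (Id) VALUES (2), (3), (4);

-- Row 2 already exists in TableA. With IGNORE_DUP_KEY = ON only that row is
-- skipped (with a warning); rows 3 and 4 are inserted.
INSERT INTO dbo.TableA SELECT * FROM dbo.TableB;

-- TableA should now contain 1, 2, 3 and 4.
SELECT Id FROM dbo.TableA ORDER BY Id;
```

If the index is created with IGNORE_DUP_KEY = OFF instead, the INSERT ... SELECT statement is expected to fail as a whole, leaving TableA with only rows 1 and 2.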

Why unique indexes can provide additional performance gains

A unique index can provide an unexpected performance boost, because it tells SQL Server that a column is guaranteed to contain no duplicate values. The two unique indexes on ProductID and Name in the AdventureWorks Product table provide a good example.

Suppose a colleague on your company's data warehouse team asks you for some information based on the Product table. The requirements are as follows:

    • Product Name
    • Quantity of products sold
    • Total Sales

Therefore, you have written the following query statement:






GROUP BY P.Name

(Translator's note: the original author's code has a problem here; I replaced ProductID with P.Name in the GROUP BY clause.)
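A query that meets these requirements might look like the following sketch. It assumes the standard AdventureWorks tables Production.Product and Sales.SalesOrderDetail. The column aliases, and the choice of SUM(OrderQty) for the quantity sold, are assumptions rather than the original author's code.

```sql
-- One row per product name: quantity sold and total sales value.
SELECT P.[Name],
       SUM(D.OrderQty)  AS QuantitySold,
       SUM(D.LineTotal) AS TotalSales
FROM Production.Product AS P
JOIN Sales.SalesOrderDetail AS D
    ON D.ProductID = P.ProductID
GROUP BY P.[Name];
```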

Your data warehouse colleague is satisfied with the query: each row contains the product name, the quantity sold, and the total sales.

However, you are concerned about the cost of this query. SalesOrderDetail is the larger of the two tables in the query, yet the query groups by product name, which comes from the Product table rather than from SalesOrderDetail.

Using SQL Server Management Studio, you see that the SalesOrderDetail table has a primary key that is also its clustered index key: SalesOrderID and SalesOrderDetailID. That key gives no help when grouping by product name.

If you ran the code from Level 5 of this series (included columns), you created the following nonclustered index.














You expect this index to help your query, because it contains all of the information the query needs except the product name, and it is sorted by ProductID. But you are still concerned that the grouping column comes from one table while the aggregated data comes from another.
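A covering index of the kind described here could look roughly like the following. The index name and the exact column list are assumptions made for illustration: the key is ProductID, and the columns the query aggregates are carried as included columns.

```sql
-- Hypothetical covering index: keyed (and therefore sorted) by ProductID,
-- with the aggregated columns included so the query never touches the base table.
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_ProductID_Demo
ON Sales.SalesOrderDetail (ProductID)
INCLUDE (OrderQty, LineTotal);
```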

In SQL Server Management Studio, you look at the execution plan for the query your data warehouse colleague asked for, shown in Figure 1.

Figure 1: Execution plan when grouping by Product.name

At first you may be surprised that the Product table's name index, Product.AK_Product_Name, is not used. Then you realize that there is a unique index on Product.Name and another on Product.ProductID, which tells SQL Server that both columns are unique. Therefore GROUP BY Name is equivalent to GROUP BY ProductID: each product forms its own group.

The query optimizer therefore recognizes that your query is equivalent to the following query, and the two indexes on ProductID support both the join and the grouping.






GROUP BY ProductID

SQL Server scans both the covering index on SalesOrderDetail and the clustered index on Product, both of which are sorted by ProductID. It can therefore use a merge join with no sorting or hashing, which gives the most efficient query plan.

Now suppose you drop the Product.AK_Product_Name index, for example:
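Dropping the index can be sketched as follows, assuming AK_Product_Name was created as a plain unique index, as in the CREATE INDEX statement at the start of the article. The second statement restores it after the experiment.

```sql
-- Remove the unique index on the product name.
DROP INDEX AK_Product_Name ON Production.Product;

-- ... re-run the query and inspect the new execution plan here ...

-- Restore the index afterwards so the table is left as it was.
CREATE UNIQUE NONCLUSTERED INDEX AK_Product_Name
ON Production.Product ([Name]);
```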








The resulting execution plan is less efficient: it needs an additional sort as well as the merge join.

Figure 2: Execution plan for queries grouped by product name when the index is dropped

As you can see, although the primary function of a unique index is to ensure the integrity of the data, it can also help the query optimizer generate a better query plan, even if the index itself is not used to access the data.

Summary

Unique indexes enforce primary key and candidate key constraints. A unique index can exist without a unique constraint, but not vice versa.

A unique index can also be a filtered index, which lets a column hold multiple NULL values while uniqueness is still enforced for the non-NULL values.

The IGNORE_DUP_KEY option affects the behavior of INSERT statements that insert multiple rows.

A unique index can also improve query performance, even when the index itself is not used for data access.
