How to create an index

Source: Internet
Author: User
Tags filegroup sql server query rounds
ReprintedBytes

Index
You can use indexes to quickly access specific information in database tables. An index is a structure that sorts the values of one or more columns in a database table (for example, the last name (lname) column of the employee table. If you want to search for him or her by the name of a specific employee, the index will help you get information faster than searching all rows in the table.
Indexes provide pointers to the data values of the specified columns stored in the table, and sort these pointers according to the specified sorting order. The database uses an index in a similar way to a book directory: searches for an index to find a specific value, and then follows the pointer to the row containing the value.
In the database graph, you can create, edit, or delete each index type on the index/Key Attribute page for the selected table. When the table attached to the index or the database relationship diagram containing the table is saved, the index is also saved. For more information, see create an index.
Generally, you must create an index on the table only when you frequently query data in the index column. Indexes occupy disk space and speed up adding, deleting, and updating rows. However, in most cases, the data retrieval speed advantage brought by indexes is much higher than its disadvantage. However, if the application updates data frequently or the disk space is limited, it is best to limit the number of indexes.

Index type
Based on the functions of the database, you can create three types of indexes in the Database Designer: unique index, primary key index, and clustered index.
PromptAlthough the unique index helps to find information, we recommend that you use the primary key constraint or unique constraint for optimal performance.
Unique Index
The unique index does not allow two rows to have the same index value.
Primary Key Index
A database table usually has a column or a combination of columns. Its values are used to uniquely identify each row in the table. This column is called the primary key of the table.
Defining a primary key for a table in the database relationship diagram will automatically create a primary key index, which is a special type of unique index. The primary key index requires that each value in the primary key be unique.
Clustered Index
In the clustered index, the physical order of each row in the table is the same as the logic (INDEX) Order of key values. A table can contain only one clustered index.
If it is not a clustered index, the physical sequence of each row in the table does not match the logical sequence of the key value. Clustered indexes provide faster data access speeds than non-clustered indexes.
In Microsoft? SQL Server? You can create clustered indexes in the database. In clustered indexes, the physical order of each row in the table is the same as that of the index key value. A table can contain only one clustered index. Clustered indexes usually speed up the update and delete operations because these two operations need to read a large amount of data. It may take a long time to create or modify a clustered index, because the two operations must be performed on the disk to reorganize the rows of the table.
You can consider using clustered indexes:
1: A column that contains a limited number of unique values. For example, the State column only contains 50 unique state codes.
2: Use the following operators to return a query of range values: Between,>, >=, <, and <=.
3: returns a query of a large result set. (From Microsoft? SQL Server? Help)
Clustered indexes are particularly effective for columns that frequently search for range values. When a clustered index is used to locate the row that contains the first value, you can ensure that the row that contains the subsequent index value is physically adjacent. For example, if an application executes a query that often retrieves records within a certain date range, the clustered index can be used to quickly locate the rows containing the start date, and then retrieve all adjacent rows in the table, until the end date is reached. This helps improve the performance of such queries. Similarly, if a column is frequently used to sort the data retrieved from a table, the table can be clustered (physically sorted) on the column ), this avoids sorting this column each time it is queried, thus saving costs.
When the index value is unique, it is efficient to use clustered indexes to find specific rows. For example, the quickest way to use the unique employee ID column emp_id to find a specific employee is to create a clustered index or primary key constraint on the emp_id column.
Before creating a clustered index, you should first understand how your data is accessed. You can consider using clustered indexes:
1. A column that contains a large number of non-repeated values.
2. Use the following operators to return the query of a range value: Between,>, >=, <, and <=.
3. columns that are continuously accessed.
4. Returns a query of a large result set.
5 columns frequently accessed by queries using join or group by clauses. Generally, these columns are foreign key columns. Index the columns specified in the order by or group by clause, so that SQL server does not have to sort the data because these rows have been sorted. This improves the query performance.
6. OLTP-type applications that require fast single-row Lookup (usually through primary keys ). Create a clustered index on the primary key.
Clustered indexes are not applicable:
1. columns that are frequently changed will move the entire row (because SQL server must physically retain the data values in the row ). Pay special attention to this because the data in the transaction processing system with large data volumes is easy to lose.
The 2-width key is used by all non-clustered indexes as the search key. Therefore, it is stored in the leaf entries of each non-clustered index.
Description
If no clustered index is created for the table and no non-clustered index is specified when the primary key constraint is created, the primary key constraint automatically creates a clustered index.
Notes
It is important to define the clustered index key as few columns as possible. If a large clustered index key is defined, any non-clustered index defined on the same table will be much larger because the non-clustered index entries contain the clustered key. When the SQL script is saved to a disk with insufficient space, no error is returned in the index Optimization Wizard.

Create Index
Creates an index for a given table or view.

Only the owner of a table or view can create an index for the table. The owner of a table or view can create an index at any time, regardless of whether the table has data. You can create an index for a table or view in another database by specifying a qualified database name.

Syntax
Create [unique] [clustered | nonclustered] indexindex_name
On {table | view} (column [ASC | DESC] [,... n])
[With <index_option> [,... n]
[On filegroup]

<Index_option >::=
{Pad_index |
Fillfactor = fillfactor |
Ignore_dup_key |
Drop_existing |
Statistics_norecompute |
Sort_in_tempdb
}

Parameters
Unique

Create a unique index for a table or view (two rows with the same index value are not allowed ). The clustered index on the view must be a unique index.

When creating an index, if the data already exists, Microsoft sqlserver&#8482; checks whether duplicate values exist and performs this check every time data is added using the insert or update statement. If duplicate key values exist, the create index statement is canceled and an error message is returned, indicating the first duplicate value. When a unique index is created, multiple null values are considered as copies.

If a unique index exists, the update or insert statement that generates duplicate key values will be rolled back, and the SQL server will display an error message. This can happen even if the update or insert statement changes many rows but only produces one duplicate value. If data is input with a unique index and the ignore_dup_key clause is specified, only rows that violate the unique index will fail. Ignore_dup_key does not work when processing the update statement.

SQL Server does not allow creating unique indexes for columns that already contain duplicate values, regardless of whether ignore_dup_key is set. If you try to do this, SQL Server displays an error message. duplicate values must be deleted before you can create a unique index for these columns.

Clustered

Create an object where the physical sorting of rows is the same as that of the index, and the lowest level (leaf level) of the clustered index contains the actual data rows. A table or view can have only one clustered index at a time.

A view with clustered indexes is called an index view. You must create a unique clustered index for the view before defining other indexes for the view.

Create a clustered index before creating any non-clustered index. When creating a clustered index, recreate the existing non-clustered index on the table.

If clustered is not specified, a non-clustered index is created.

The clustering index has the same leaf level as its data page according to the definition, therefore, when you create a clustered index, The on filegroup clause will actually move the table from the file used when the table is created to the new file group. Before creating a table or index on a specific file group, check which file groups are available and have sufficient space for the index. The size of the file group must be at least 1.2 times the size required for the entire table.

Nonclustered

Creates a logical sorting object for the specified table. For non-clustered indexes, the physical sorting of rows is independent of the index sorting. The leaf level of a non-clustered index contains the index rows. Each index row contains a non-clustered key value and one or more row locators (pointing to rows containing this value ). If the table has no clustered index, the row locator is the disk address of the row. If a table has a clustered index, the row locator is the clustered index key of the row.

Each table can have a maximum of 249 non-clustered indexes (no matter how these non-clustered indexes are created: using primary key and unique constraints for implicit creation or using CREATE index for explicit creation ). Each index can provide access to data in different sorting order.

For an index view, you can only create a non-clustered index for a view that has a clustered index defined. Therefore, the row positioner of non-clustered indexes in the index view must be the clustering key of rows.

Index_name

Is the index name. The index name must be unique in the table or view, but not in the database. The index name must follow the identifier rules.

Table

The table that contains the columns for which the index is to be created. You can specify the database and table owner.

View

The name of the view to which the index is to be created. You must use schemabinding to define a view to create an index on The View. View definitions must also be deterministic. If all expressions, where, and group by clauses in the selection list are deterministic, the view is also deterministic. Furthermore, all key columns must be accurate. Only a non-key column of a view may contain a floating point expression (using a float data expression), and a float expression cannot be used anywhere in the view.

To search for columns in the deterministic view, use the columnproperty function (isdeterministic attribute ). The isprecise attribute of this function can be used to determine whether the key column is accurate.

You must first create a unique clustered index for the view to create a non-clustered index for the view.

In SQL Server Enterprise or development edition, the query optimizer can use the index view to speed up query execution. To make the optimizer consider replacing the view, you do not need to reference the view in the query.

When creating an index view or performing operations on the rows in the Table participating in the index view, there are seven set options that must be specified. Set options: arithabort, concat_null_yields_null, quoted_identifier, ansi_nulls, ansi_padding, and ansi_warning must be on. Set option numeric_roundabort must be off.

If it is different from the preceding settings, the data modification statements (insert, update, and delete) executed on any table referenced by the index view will fail, and the SQL server will display an error message, lists All set options that violate the settings. In addition, for select statements involving the index view, if the value of any set option is not the required value, SQL server does not consider replacing the index view when processing the SELECT statement. This ensures that the query results are correct when the set option is affected.

If the application uses a DB-library connection, you must assign the required values to all seven set options on the server. (By default, all set options except arithabort are set correctly for oledb and ODBC connections .)

If not all of the preceding set options have the required values, some operations (such as BCP, replication, or distributed queries) may not be able to update the tables involved in the index view. In most cases, setting arithabort to on can avoid this problem.

We strongly recommend that you create the first index view or index on the computing column in any database on the server and set the arithabort user option to on as early as possible within the server range.

For more information about the notes and restrictions of the index view, see the Notes section.

Column

Apply the index column. Specify two or more column names to create a composite index for the combined values of the specified column. List the columns to be included in the composite index (sorted by order priority) in the parentheses after the table ).

It indicates that columns composed of ntext, text, or image data types cannot be specified as index columns. In addition, a view cannot contain any text, ntext, or image columns, even if they are not referenced in the create index statement.

When two or more columns are best searched as one unit, or many queries reference only the specified columns in the index, a composite index should be used. A maximum of 16 columns can be combined into a composite index. All columns in the composite index must be in the same table. The maximum size of the combined index value is 900 bytes. That is to say, the total length of a fixed-size column that compose a composite index cannot exceed 900 bytes. For more information about variable type columns in a composite index, see the comments section.

[ASC | DESC]

Determines the ascending or descending direction of an index column. The default value is ASC.

N

It indicates that multiple columns placeholders can be specified for a specific index.

Pad_index

Specify the reserved space for each page (node) in the middle of the index. The pad_index option is useful only when fillfactor is specified, because pad_index uses the percentage specified by fillfactor. By default, given the key set on the intermediate page, SQL Server ensures that the available space on each index page can accommodate at least one of the largest rows allowed by the index. If the percentage specified for fillfactor is not large enough to accommodate a row, sqlserver replaces the percentage with the allowed minimum internally.

This indicates that the number of rows on the intermediate index page will never be less than two, no matter how small the fillfactor value is.

Fillfactor = fillfactor

Specify the page-level filling of each index during the SQL Server Index creation process. If an index page is filled up, SQL Server must take the time to split the index page to free up space for new rows, which requires a great deal of overhead. For tables with frequent updates, selecting an appropriate fillfactor value is better than selecting an inappropriate fillfactor value. The original value of fillfactor will be stored with the index in sysindexes.

If fillfactor is specified, SQL Server rounds up the number of rows to be placed on each page. For example, if you issue create clusteredindex... fillfactor = 33, a clustered index with fillfactor 33% will be created. Assume that SQL Server calculates 33% rows of space per page. SQL Server rounds it up so that six lines are placed on each page.

It indicates that the explicit fillfactor setting is only applied when the index is created for the first time. SQL Server does not dynamically retain the specified percentage of available space on the page.

The fillfactor value specified by the user can range from 1 to 100. If no value is specified, the default value is 0. If fillfactor is set to 0, only leaf pages are filled. You can change the default fillfactor settings by executing sp_configure.

Fillfactor 100 can be used only when no insert or update statement appears (for example, for a read-only table. If fillfactor is 100, SQL Server will create a leaf-level page 100% filled index. If you execute insert or update after creating an index with fillfactor of 100%, the page will be split for each insert operation and every update operation.

If the fillfactor value is small (except 0), SQL Server will create a new index with incomplete leaf pages. For example, if we know that the data contained in a table is only a small part of the final data to be included in the table, it is reasonable to set fillfactor to 10 when creating an index for the table. If the fillfactor value is small, the index occupies a large amount of storage space.

The following table describes how to fill the index page with fillfactor specified.

Fillfactor intermediate page leaf page
0 fill up an available item 100%
1%-99 one available item <= fillfactor % fill
100% one available item 100% fill

One available item refers to the space on the page that can accommodate another index item.

It is important to use a fillfactor value to create a clustered index, which affects the amount of storage space occupied by data, because SQL Server re-distributes data when creating a clustered index.

Ignore_dup_key

Controls what happens when you try to insert duplicate key values to columns that belong to a unique clustered index. If you specify ignore_dup_key for the index and execute the insert statement that creates the duplicate key, SQL server sends a warning message and ignores duplicate rows.

If the ignore_dup_key is not specified for the index, SQL server sends a warning message and rolls back the entire insert statement.

The following table shows when ignore_dup_key can be used.

Index type options
Clustering not allowed
Unique aggregation allows the use of ignore_dup_key
Non-clustering is not allowed
Unique non-clustering allows ignore_dup_key

Drop_existing

Specify that you should remove and recreate the named existing clustered index or non-clustered index. The specified index name must be the same as the existing index name. Because a non-clustered index contains a clustered key, a non-clustered index must be rebuilt when the clustered index is removed. If clustered indexes are re-built, non-clustered indexes must be re-built to use the new key set.

When a clustered index is created for a table that already has a non-clustered index (the same or different key sets are used), The drop_existing clause can improve the performance. The drop_existing clause replaces the execution of the drop index Statement on the old clustered index, and then the execution of the create index statement on the new clustered index. Non-clustered indexes only need to be rebuilt once, and are only required when keys are different.

If the key is not changed (the provided index name and column are the same as the original index), The drop_existing clause will not sort the data again. This is useful when indexes must be compressed.

You cannot use the drop_existing clause to convert a clustered index to a non-clustered index. However, you can change a unique clustered index to a non-unique index, and vice versa.

When the create index statement with the drop_existing clause is executed, SQL Server assumes that the index is consistent (that is, the index is not damaged ). The rows in the specified index should be sorted by the specified key referenced in the create index statement.

Statistics_norecompute

The specified expired index statistics are not automatically recalculated. To restore automatic update statistics, You can execute updatestatistics without the norecompute clause.

Important If you disable automatic re-calculation of distribution statistics, the SQL Server Query Optimizer may be prevented from selecting the best execution plan for queries involving the table.

Sort_in_tempdb

The intermediate sorting result used to generate the index is stored in the tempdb database. If tempdb and user database are not in the same disk set, this option may reduce the time required to create an index, but will increase the disk space used to create an index.

For more information, see tempdb and index creation.

On filegroup

Creates a specified index on a given filegroup. The file group must have been created by executing create database or alterdatabase.

Note
When a table or index is allocated space, an extended Disk Area (8 8 KB pages) is increased each time ). Each time you fill in an extended disk area, one will be assigned. If a table is very small or empty, its index will be allocated on a single page until eight pages are added to the index, and then the extended Disk Area is allocated. To obtain a Report on the amount of space allocated and occupied by the index, use sp_spaceused.

Creating clustered indexes requires that the available space in the database be approximately 1.2 times the data size. This space does not include the space occupied by existing tables. Data will be copied to create a clustered index. The old unindexed data will be deleted after the index is created. When the drop_existing clause is used, the space required for clustered indexes is the same as that for existing indexes. The extra space required may also be affected by the specified fillfactor.

When creating an index in SQL Server 2000, you can use the sort_in_tempdb option to indicate that the database engine stores intermediate index sorting results in tempdb. If tempdb is different from the disk set where the user database is located, this option may reduce the time required to create an index, but will increase the disk space used to create an index. In addition to the space required to create an index in the user database, tempdb must have approximately the same additional space to store intermediate sorting results. For more information, see tempdb and index creation.

The create index statement is optimized like other queries. The SQL Server Query processor can scan another index instead of executing a table scan to save I/O operations. In some cases, sorting is not required.

On a multi-processor computer running SQL Server Enterprise Manager and programmer edition, create Index automatically uses multiple processors to perform scanning and sorting, which is the same as other queries. The number of processors used to execute a create index statement is determined by the configuration option maxdegree of parallelism and the current workload. If SQL server detects that the system is busy, the concurrency of the create index operation is automatically reduced before the statement is executed.

All file groups affected by the create index statement since the last file group backup must be backed up as one unit. For more information about file and file group backup, see backup.

The backup and create index operations do not affect each other. If a backup is in progress, an index is created in the full logging mode, which may require additional log space.

To display reports about object indexes, execute sp_helpindex.

You can create indexes for temporary tables. When a table is removed or the session is terminated, all indexes and triggers are removed.

Variable type columns in the Index
The maximum size of the index key is 900 bytes. However, SQL Server 2000 allows you to create an index on a column that may contain a large number of variable types of columns. The maximum size of these columns exceeds 900 bytes.

When creating an index, SQL Server checks the following conditions:

The total length of all fixed data columns involved in index definition must be less than or equal to 900 bytes. When the index to be created is composed of only fixed data columns, the total size of the fixed data column must be less than or equal to 900 bytes. Otherwise, you cannot create an index and SQL server returns an error.

If the index definition consists of Fixed-type columns and variable-type columns, and the fixed data column meets the preceding conditions (less than or equal to 900 bytes ), the SQL Server still needs to check the total size of a variable type column. If the maximum size of a variable type column and the sum of the fixed data column size is greater than 900 bytes, SQL Server creates an index. However, a warning message is returned to remind you: if the total size of a subsequent insertion or update operation on a variable type column exceeds 900 bytes, the Operation will fail and the user will receive a runtime error. Similarly, if the index definition only consists of Variable-type columns and the maximum size of these columns is greater than 900 bytes, sqlserver will create an index, but a warning message will be returned.
For more information, see the maximum value of the index key.

Considerations when creating an index on a calculated column and view
In SQL Server 2000, you can also create indexes on computing columns and views. Creating a unique clustered index on a view can improve query performance, because the view is stored in the database in the same way as the table with clustered indexes.

As long as the unique or primary key meets all the index conditions, it can contain the computed column. Specifically, the calculated column must be deterministic, accurate, and cannot contain text, ntext, or image columns. For more information about certainty, see deterministic and non-deterministic functions.

Creating an index on a calculated column or view may cause the previous insert or update operation to fail. This error may occur when the calculation column causes an arithmetic error. For example, although column C in the following table may cause arithmetic errors, the insert statement is still valid:

Create Table T1 (A int, B INT, C as A/B)
Go
Insert into T1ValueS ('1', '0 ')
Go

On the contrary, if an index is created on column C after a table is created, the insert statement fails.

Create Table T1 (A int, B INT, C as A/B)
Go
Create unique clustered index idx1 on t1.c
Go
Insert into T1ValueS ('1', '0 ')
Go

The query results obtained by using indexes on a view defined by a number or float expression may be different from the results obtained by similar queries that do not use indexes on The View. This difference may be caused by rounding errors during insert, delete, or update operations on the base table.

To prevent SQL Server from using the index view, include the option (expand views) Prompt in the query. In addition, incorrect settings of any of the listed options will prevent the optimizer from using the indexes on The View. For more information about option (expand views) prompts, see select.

Restrictions on the index View
The SELECT statement defining the index view cannot contain the top, distinct, compute, having, and Union keywords. Nor can it contain subqueries.

The select list cannot contain asterisks (*), 'table. * 'wildcard list, distinct, count (*), count (<expression>), calculation columns, and scalar aggregation in the base table.

The non-aggregate select list cannot contain expressions. The aggregate select list (queries containing group by) may contain sum and count_big (<expression>); it must contain count_big (*). Other Aggregate functions (Min, Max, STDev,...) are not allowed ,...).

Complex aggregation using AVG cannot participate in the select list of the indexed view. However, if the query uses such aggregation, the optimizer will be able to use this index view and replace AVG with a simple aggregation combination of sum and count_big.

If a column is obtained from a float data type or an expression that uses a float expression for value, it cannot be used as the index key of the index view or computed column in the table. Such columns are considered inaccurate. Use the columnproperty function to determine whether the columns in a specific calculation column or view are accurate.

The index view is subject to the following additional restrictions:

The creator of the index must own the table. All tables, views, and indexes must be created in the same database.

Select statements defining the index view cannot contain views, row set functions, in-row functions, or derived tables. The same physical table can only appear once in this statement.

In any join table, the outer join operation is not allowed.

Subquery, contains, or freetext predicates cannot be used in search conditions.

If a view definition contains a group by clause, the select list of the view must contain all groups based on the column and count_big (*) expression. In addition, the create unique clustered index clause must contain only these columns.
The definition subject of a view that can create an index must be deterministic and accurate, which is similar to the index requirement of a computing column. For more information, see create an index on a calculated column.

Permission
The create index permission is granted to SysAdmin fixed server roles, db_ddladmin and db_owner fixed database roles and table owners by default, and cannot be transferred.

Example
A. Use simple Indexes
The following example creates an index for the au_id column of the authors table.

Set nocount off
Use pubs
If exists (Select name from sysindexes
Where name = 'au _ id_ind ')
 Drop index authors. au_id_ind
Go
Use pubs
Create inde

I also hope to organize the principles of index building and make a FAQ.
Let's talk about the problematic index. We can look at the indexing principles in turn:
1. There is no index on the table, resulting in full table scanning.
2. The index creation table is too small and there is no need to create an index.
3. Each row of data in the index corresponds to too many rows in the table, such as creating an index for gender.
4. For fields that appear in the WHERE clause but are always used with functions or operators to create indexes, such indexes do not work.
5. Too many indexes on the table, especially in the OLTP system, resulting in slow insertion.
6. The index item is too large.
7. unreasonable compound indexes.

I want to write a few articles first.
Note: indexes must be used with SQL statement optimization.

Trackback: http://tb.donews.net/TrackBack.aspx? Postid = 468076

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.