Introduction to Database Indexes (reprint)

Source: Internet
Author: User

An index can be thought of as a special kind of directory. Microsoft SQL Server provides two types of index: clustered indexes and nonclustered indexes (also written "non-clustered"). The following example illustrates the difference between them.
The body of a Chinese dictionary is itself a clustered index. For example, to look up the character pronounced "an", we naturally open the first few pages of the dictionary, because its pinyin is "an" and a dictionary ordered by pinyin runs from "a" to "z", so "an" falls near the front. If you have gone through every entry beginning with "a" and still cannot find the character, then it is not in your dictionary. Likewise, to look up the character "zhang", you turn to the last part of the dictionary, because its pinyin is "zhang". In other words, the body of the dictionary is itself a directory, and you do not need to consult any other directory to find what you are looking for.
Body content that is itself arranged according to a fixed rule, and so serves as its own directory, is called a clustered index.
If you know a character's pronunciation, you can find it quickly in the body of the dictionary. But you may also meet a character you do not recognize and cannot pronounce. In that case the method above does not work. Instead, you look the character up by its radical and then turn directly to the page number listed next to it. The order of characters in the radical directory and its character lookup table is not the order of the body, however. For example, when you look up "zhang", the lookup table for its radical gives page 672. The character just above it in the table is "chi", on page 63, and the character just below it is the character for "crossbow", on page 390. Clearly these characters do not sit next to "zhang" in the body. The consecutive sequence "chi, zhang, crossbow" is their order in the nonclustered index, which maps the characters of the body into a different order. You can still find the character you need this way, but it takes two steps: first find the entry in the directory, then turn to the page number it gives.
An arrangement in which the directory is purely a directory and the body is purely the body, each with its own ordering, is called a nonclustered index.
The example above shows what clustered and nonclustered indexes are.
It also makes clear why each table can have only one clustered index: the body can be physically sorted in only one way.
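The dictionary analogy can be sketched in a few lines of Python. This is only an illustrative model, not how SQL Server stores data: the page numbers 63, 390 and 672 come from the example above, while the key "nu" (standing in for the character translated as "crossbow"), the entry texts, and all function names are assumptions made for the sketch.

```python
from bisect import bisect_left

# Body of the dictionary, physically ordered by pinyin (the clustered index).
# Page numbers are the ones quoted in the text; "nu" stands in for the
# character translated as "crossbow", and the entry texts are placeholders.
BODY = [
    ("chi", 63, "entry text for chi"),
    ("nu", 390, "entry text for nu"),
    ("zhang", 672, "entry text for zhang"),
]
BODY_KEYS = [pinyin for pinyin, _, _ in BODY]
PAGE_TO_ENTRY = {page: text for _, page, text in BODY}

# Radical lookup table (the nonclustered index): it has its own ordering
# and stores only a pointer (a page number) into the body.
RADICAL_TABLE = [("chi", 63), ("zhang", 672), ("nu", 390)]


def lookup_clustered(pinyin):
    """One step: binary-search the sorted body itself."""
    i = bisect_left(BODY_KEYS, pinyin)
    if i < len(BODY) and BODY_KEYS[i] == pinyin:
        return BODY[i][2]
    return None


def lookup_nonclustered(char):
    """Two steps: find the page in the lookup table, then turn to that page."""
    for key, page in RADICAL_TABLE:
        if key == char:
            return PAGE_TO_ENTRY[page]
    return None


# Neighbours in the lookup table are far apart in the body.
table_pages = [page for _, page in RADICAL_TABLE]
print(lookup_clustered("zhang"), lookup_nonclustered("zhang"), table_pages)
```

Both lookups return the same entry, but the second one goes through the separate table first, which is the extra step the text describes.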

Reprint: http://www.php100.com/html/webkaifa/database/Mysql/2009/0104/1150.html


Why create an index? Because an index can greatly improve system performance.
First, a unique index guarantees the uniqueness of each row of data in a table.
Second, an index can greatly speed up data retrieval, which is the main reason for creating one.
Third, it can speed up joins between tables, which is particularly significant for enforcing referential integrity.
Finally, when a query uses grouping or sorting clauses, an index can significantly reduce the time spent grouping and sorting.
In addition, with indexes in place the query optimizer can be put to use during query processing, which improves system performance.
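The effect on retrieval can be observed by asking a database engine for its query plan before and after an index exists. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, purely because it runs anywhere; the table, column and index names are made up for the illustration, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, last_name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (last_name, dept) VALUES (?, ?)",
    [(f"name{i}", f"dept{i % 10}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM employee WHERE last_name = ?"

# Without an index the engine has to scan the whole table.
plan_before = plan(query, ("name42",))

# With an index on the searched column it can search the index instead.
conn.execute("CREATE INDEX idx_employee_last_name ON employee (last_name)")
plan_after = plan(query, ("name42",))

print(plan_before)
print(plan_after)
```

The first plan reports a table scan and the second names the new index, which is the optimizer choosing the faster access path once one is available.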

One might ask: if indexes have so many advantages, why not create one on every column of a table? The idea is reasonable up to a point, but it is one-sided. Despite their advantages, indexing every column in a table is very unwise, because indexes also carry significant costs.

First, creating and maintaining indexes takes time, and that time grows as the amount of data grows.
Second, indexes occupy physical space. In addition to the space taken by the table's data, each index takes up a certain amount of physical space, and a clustered index requires even more.
Third, whenever rows in the table are inserted, deleted, or updated, the indexes must be maintained as well, which slows down data modification.
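The space cost is easy to measure on a small scale. The following sketch, again using sqlite3 as a stand-in engine, builds the same table with zero, one and two indexes and compares the resulting database file sizes. The table layout, row count and helper name are assumptions chosen for the demonstration; absolute sizes will differ on SQL Server.

```python
import os
import sqlite3
import tempfile


def db_size_with_indexes(index_columns, rows=5000):
    """Build a small table, add one index per listed column,
    and return the database file size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute(
            "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )
        conn.executemany(
            "INSERT INTO person (name, city) VALUES (?, ?)",
            [(f"name{i:05d}", f"city{i % 50}") for i in range(rows)],
        )
        for col in index_columns:
            conn.execute(f"CREATE INDEX idx_person_{col} ON person ({col})")
        conn.commit()
        conn.close()
        return os.path.getsize(path)


size_no_index = db_size_with_indexes([])
size_one_index = db_size_with_indexes(["name"])
size_two_indexes = db_size_with_indexes(["name", "city"])
print(size_no_index, size_one_index, size_two_indexes)
```

Every additional index makes the file larger, and each of those structures also has to be updated on every insert, delete or update of the indexed columns.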

Indexes are built on particular columns of a table, so when creating an index you should consider carefully which columns should be indexed and which should not. In general, indexes should be created on the following columns:

Columns that are searched frequently, to speed up searches;
Primary key columns, to enforce uniqueness and to organize the arrangement of the data in the table;
Columns that are frequently used in joins, which are mostly foreign keys, to speed up the joins;
Columns that are often searched by range, because the index is sorted and the specified range is therefore contiguous;
Columns that often need to be sorted, because the index is already sorted and a query can use that order to reduce sorting time;
Columns that frequently appear in WHERE clauses, to speed up evaluation of the conditions.
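Several of the cases in this list can be checked against a real query planner. The sketch below uses sqlite3 as a convenient stand-in for SQL Server; the orders table and its index names are hypothetical, and plan text varies between SQLite versions, so the code only looks at which index the plan mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount, note) VALUES (?, ?, ?)",
    [(i % 100, float(i), "n/a") for i in range(2000)],
)
conn.execute("CREATE INDEX idx_orders_amount ON orders (amount)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Range search: the matching keys are contiguous inside the sorted index.
range_plan = plan("SELECT * FROM orders WHERE amount BETWEEN 100 AND 200")

# Foreign-key style column that would typically appear in joins and WHERE.
customer_plan = plan("SELECT * FROM orders WHERE customer_id = 7")

# Sorting: the engine may read rows in index order instead of sorting them.
sort_plan = plan("SELECT * FROM orders ORDER BY amount")

for label, text in [("range", range_plan), ("customer", customer_plan), ("sort", sort_plan)]:
    print(label, "->", text)
```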

Similarly, some columns should not be indexed. In general, such columns have the following characteristics:

First, columns that are seldom used or referenced in queries should not be indexed. Because these columns are rarely used, indexing them does not improve query speed. On the contrary, the extra index slows down maintenance and increases the space requirement.
Second, columns that have only a few distinct values should not be indexed. Because such a column has very few values, for example the gender column of a personnel table, the rows returned by a query make up a large proportion of the rows in the table, so a large share of the table has to be read anyway. Adding an index does not significantly speed up retrieval.
Third, columns defined with the text, image, or bit data types should not be indexed, because these columns either hold very large amounts of data or take very few distinct values.
Fourth, an index should not be created when modification performance matters far more than retrieval performance. The two are in conflict: adding indexes improves retrieval performance but reduces modification performance, and removing indexes improves modification performance but reduces retrieval performance.
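The second point, about columns with few distinct values, can be made concrete with a small calculation. This is a rough sketch in plain Python; the function names and sample data are invented for the illustration and are not taken from any database product.

```python
def selectivity(values):
    """Fraction of distinct values in a column.
    Close to 1.0 means almost every row is different; close to 0.0
    means the same few values repeat over and over."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def fraction_returned(values, wanted):
    """Fraction of rows an equality search for `wanted` would return."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v == wanted) / len(values)


# A hypothetical personnel table with 1000 rows.
gender_column = ["F" if i % 2 == 0 else "M" for i in range(1000)]
badge_column = [f"B{i:04d}" for i in range(1000)]

print(selectivity(gender_column), fraction_returned(gender_column, "F"))
print(selectivity(badge_column), fraction_returned(badge_column, "B0042"))
```

A search on the gender column still returns half the table, so an index saves little, whereas a search on the badge column returns a single row and benefits greatly.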

Methods for creating indexes and characteristics of indexes
How to create an index
There are two ways to create an index: directly and indirectly. An index is created directly with the CREATE INDEX statement or with the Create Index Wizard. An index is created indirectly when you define a PRIMARY KEY or UNIQUE constraint on a table, which also creates an index. Both approaches produce an index, but they differ in the details.
Using the CREATE INDEX statement or the Create Index Wizard is the most basic way to create an index. It is also the most flexible, and lets you tailor the index to your needs. When you create an index this way, you can use a number of options that optimize the index, such as specifying the fill factor of the data pages, sorting, and gathering statistics. You can also specify the type, uniqueness, and composition of the index: it can be clustered or nonclustered, and it can be built on a single column or on two or more columns.

You can also create an index indirectly by defining a PRIMARY KEY or UNIQUE constraint. A PRIMARY KEY constraint is a logical mechanism for preserving data integrity: it prevents two records in the table from having the same primary key value. When you create a PRIMARY KEY constraint, the system by default automatically creates a unique clustered index. Although the PRIMARY KEY constraint is an important structure logically, the physical structure that corresponds to it is a unique clustered index. In other words, in the physical implementation there is no PRIMARY KEY constraint as such, only a unique clustered index. Similarly, creating a UNIQUE constraint also creates an index, which by default is a unique nonclustered index. As a result, when an index is created through a constraint, its type and characteristics are largely predetermined, and there is relatively little room for customization.
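The direct and indirect routes can both be seen in a small experiment. The sketch below uses sqlite3 rather than SQL Server, so it does not show the clustered or nonclustered distinction described above; it only shows that constraints create indexes implicitly while CREATE INDEX creates one explicitly. The staff table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two constraints: each one makes the engine create an index implicitly.
conn.execute(
    "CREATE TABLE staff ("
    " badge_no TEXT PRIMARY KEY,"
    " email TEXT UNIQUE,"
    " name TEXT)"
)

# One index created directly with CREATE INDEX.
conn.execute("CREATE INDEX idx_staff_name ON staff (name)")

# PRAGMA index_list rows are (seq, name, unique, origin, partial), where
# origin is 'pk' for PRIMARY KEY, 'u' for UNIQUE and 'c' for CREATE INDEX.
index_rows = conn.execute("PRAGMA index_list('staff')").fetchall()
index_origins = {row[1]: row[3] for row in index_rows}
index_is_unique = {row[1]: bool(row[2]) for row in index_rows}

for name in index_origins:
    print(name, index_origins[name], index_is_unique[name])
```

The two constraint-backed indexes are reported as unique and were never named by the user, which mirrors the point that their type and characteristics are decided by the constraint.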

When a PRIMARY KEY or UNIQUE constraint is defined on a table that already has a standard index created with the CREATE INDEX statement, the index created by the constraint overrides the previously created standard index. That is, an index created by a PRIMARY KEY or UNIQUE constraint takes precedence over an index created with the CREATE INDEX statement.

Characteristics of the Index
An index has two important characteristics: whether it is unique and whether it is composite.
A unique index guarantees that all values in the indexed column are unique, with no duplicates. If a table has a PRIMARY KEY or UNIQUE constraint, SQL Server automatically creates a unique index when the table is created or altered. However, if uniqueness must be guaranteed, you should create a PRIMARY KEY or UNIQUE constraint rather than a unique index. When creating a unique index, keep these rules in mind:
When a PRIMARY KEY or UNIQUE constraint is created on a table, SQL Server automatically creates a unique index;
If the table already contains data, SQL Server checks the existing data for duplicates when the index is created;
Whenever data is inserted with an INSERT statement or changed with a modification statement, SQL Server checks for duplicates, and if a duplicate value is found it cancels the statement and returns an error message;
A unique index ensures that every row in the table has a unique value, so that each entity can be uniquely identified;
A unique index can only be created on columns that guarantee entity integrity. For example, you cannot create a unique index on the name column of a personnel table, because different people can have the same name.
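The duplicate checks described above, both at index creation and on later inserts, can be reproduced with any engine that supports unique indexes. The following sketch uses sqlite3 as a stand-in for SQL Server; the staff table, its sample rows and the index names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, badge_no TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO staff (badge_no, name) VALUES (?, ?)",
    [("B001", "Li Wei"), ("B002", "Li Wei")],  # two people share a name
)

# Existing data is checked when the index is created: the name column
# already holds duplicates, so a unique index on it is refused.
try:
    conn.execute("CREATE UNIQUE INDEX ux_staff_name ON staff (name)")
    name_index_created = True
except sqlite3.IntegrityError:
    name_index_created = False

# The badge numbers are distinct, so this unique index is accepted.
conn.execute("CREATE UNIQUE INDEX ux_staff_badge ON staff (badge_no)")

# Later inserts are checked too: a repeated badge number is rejected.
try:
    conn.execute(
        "INSERT INTO staff (badge_no, name) VALUES (?, ?)", ("B001", "Zhang Min")
    )
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(name_index_created, duplicate_accepted, row_count)
```

The rejected statement leaves the table unchanged, which matches the behavior described for SQL Server: the statement is cancelled and an error is returned.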

A composite index is an index created on two or more columns. When searches use two or more columns together as a key, it is best to create a composite index on those columns. When creating a composite index, keep these rules in mind:
Up to 16 columns can be combined into a single composite index, and the total length of those columns cannot exceed 900 bytes, so the combined key cannot be too long;
All columns in a composite index must come from the same table; a composite index cannot span tables;
The order of the columns matters a great deal and should be chosen carefully. In principle, the most selective (most nearly unique) column should come first. For example, an index on (col1,col2) is not the same as an index on (col2,col1), because the column order differs;
For the query optimizer to use a composite index, the WHERE clause of the query must reference the first column of the index;
A composite index is useful when a table has multiple key columns;
Using composite indexes can improve query performance and reduce the number of indexes created on a table.
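The rule about the first column can be observed directly in a query plan. The sketch below uses sqlite3 instead of SQL Server and a throwaway table t with columns col1 and col2 as in the example above; the payload column and the index name are made up. Other engines, or SQLite after gathering statistics, may apply optimizations that change the last result, so treat it as an illustration of the principle only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(i, i % 7, "x") for i in range(1000)],
)
conn.execute("CREATE INDEX idx_t_col1_col2 ON t (col1, col2)")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The WHERE clause references the leading column: the index is usable.
leading_plan = plan("SELECT payload FROM t WHERE col1 = 5")

# Both columns referenced: the index is usable on both parts of the key.
both_plan = plan("SELECT payload FROM t WHERE col1 = 5 AND col2 = 5")

# Only the second column referenced: the engine falls back to a table scan.
trailing_plan = plan("SELECT payload FROM t WHERE col2 = 5")

print(leading_plan)
print(both_plan)
print(trailing_plan)
```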

Reprint: http://blog.csdn.net/pang040328/article/details/4164874
