Importance of Data Indexing

Source: Internet
Author: User
Preface

It is embarrassing to admit that after three years of working with databases, I only learned how important indexes are yesterday. Recently our system had become very slow when displaying data: selecting one hour of data out of 70 million records took 10 seconds. After a long investigation I found that the bottleneck was in the data query, but at the time I did not know how to fix it.

Yesterday a customer asked me whether the table had any indexes. I said yes (because an index is created automatically when the primary key is created). He told me that he had run into performance problems before, and that things improved a lot once he created indexes. So I added an index on the columns used in the WHERE clause, and the same SELECT statement became more than 10 times faster: the query that originally took 10 seconds now completes in less than 1 second. Today I collected some information about indexes and put together a simple example to share.

 

Comparison with and without an index

Database platform: SQL Server 2005 Express

Tools: SQL Server Profiler

Microsoft Visual Studio Team System 2008 Database Edition

Data volume: 100 million records

Test table: Customs (ID, name ,.....)

Test query: Select * from Customs where name like '% limit %' (2491 results)


Response time without an index

Response time: 49 seconds 21

Response time with an index

Response time: 4 seconds 52

With the index, the same query is roughly ten times faster.
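
If Profiler is not at hand, one way to reproduce a comparison like this is to let the session report its own timings. The following is a minimal sketch, assuming the Customs table lives in the dbo schema; SET STATISTICS TIME and SET STATISTICS IO are standard T-SQL session options, and the figures they print will differ from machine to machine.

```sql
-- Report compile and execution times, plus page reads, for each statement.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- The test query from this article. Run it once before and once after
-- creating the index, and compare the elapsed time and logical reads.
SELECT * FROM dbo.Customs WHERE name LIKE '% limit %';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

In SQL Server Management Studio the output appears on the Messages tab. A second run of the same query may be faster simply because the data is already cached, so it is worth repeating each measurement a few times. Note also that a pattern with a leading wildcard generally cannot use an index seek; any gain in a test like this one most likely comes from scanning the narrower index instead of the whole table.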

 

How to create an index

You can use SQL or tools to create indexes.

Use SQL

CREATE NONCLUSTERED INDEX ix_customs ON dbo.Customs
(
    Name
)
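
To confirm that the index was actually created, the catalog can be queried directly. This is a small sketch, assuming the table is dbo.Customs and the index is named ix_customs as in the statement above; sys.indexes is a catalog view available in SQL Server 2005 and later.

```sql
-- List the indexes defined on dbo.Customs.
SELECT i.name, i.type_desc, i.is_unique, i.is_primary_key
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.Customs');

-- Remove the index again, for example to repeat the timing test without it.
DROP INDEX ix_customs ON dbo.Customs;
```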

Use tools such as SQL Server Management Studio

When you edit a table in SQL Server Management Studio, the menu shows the Table Designer commands. Choose Indexes/Keys there, and the Indexes/Keys dialog box appears, where you can define the index.

Figure 2. Table Designer menu

Figure 3. Indexes/Keys dialog box

 

The next article will describe how to use Database Engine Tuning Advisor to analyze a workload and get index recommendations.

 

What is an index (excerpted from MSDN)

Like the index in a book, an index in a database lets you quickly find specific information in a table or indexed view. An index contains keys built from one or more columns in the table or view, and pointers that map to the storage location of the specified data. By creating well-designed indexes to support your queries, you can significantly improve the performance of database queries and applications. Indexes can reduce the amount of data that must be read to return the query result set. Indexes can also enforce uniqueness on the rows in a table, which helps ensure the integrity of the table data.

 

Basic indexing concepts

An index is an on-disk structure associated with a table or view that speeds up the retrieval of rows from that table or view. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure (a B-tree) that lets SQL Server find the row or rows associated with the key values quickly and efficiently.

A table or view can contain the following types of indexes:

    • Clustered

      • A clustered index sorts and stores the data rows of the table or view based on their key values. The key consists of the columns included in the index definition. Because the data rows themselves can be sorted in only one order, each table can have only one clustered index.
      • The data rows of a table are stored in sorted order only when the table has a clustered index. A table with a clustered index is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
    • Nonclustered
      • A nonclustered index has a structure that is completely separate from the data rows. It contains the nonclustered index key values, and each key value entry has a pointer to the data row that contains that key value.
      • The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or in a clustered table. For a heap, the row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.
      • In SQL Server 2005, you can add nonkey columns to the leaf level of a nonclustered index to get around the existing index key limits (900 bytes and 16 key columns) and to run fully covered, indexed queries.
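
The index types described in this list can be sketched in T-SQL as follows. The table dbo.Orders and all index names are hypothetical and exist only for illustration; the statements themselves use the standard CREATE INDEX syntax of SQL Server 2005.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.Orders
(
    OrderID    int           NOT NULL,
    CustomerID int           NOT NULL,
    OrderDate  datetime      NOT NULL,
    Note       nvarchar(200) NULL
);

-- Without a clustered index the table is a heap. Creating one turns it into
-- a clustered table whose rows are stored in OrderID order.
CREATE CLUSTERED INDEX cix_Orders_OrderID
    ON dbo.Orders (OrderID);

-- A nonclustered index is a separate structure. Because dbo.Orders is now a
-- clustered table, each index row uses the clustered key as its row locator.
CREATE NONCLUSTERED INDEX ix_Orders_CustomerID
    ON dbo.Orders (CustomerID);

-- Nonkey columns added with INCLUDE are stored only at the leaf level and do
-- not count toward the 900-byte / 16-column key limits.
CREATE NONCLUSTERED INDEX ix_Orders_CustomerID_incl
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, Note);
```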

Both clustered and nonclustered indexes can be unique. This means that no two rows can have the same value for the index key. Otherwise, the index is not unique, and multiple rows can share the same key value.
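
As a brief illustration of uniqueness, the sketch below creates a unique index and then tries to insert a duplicate. The table dbo.Members, its columns, and the index name are assumptions made up for this example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.Members
(
    MemberID int           NOT NULL,
    Email    nvarchar(100) NOT NULL
);

-- A unique index: no two rows may share the same Email value.
CREATE UNIQUE NONCLUSTERED INDEX ux_Members_Email
    ON dbo.Members (Email);

INSERT INTO dbo.Members (MemberID, Email) VALUES (1, N'a@example.com');

-- This second insert is rejected with a duplicate key error,
-- because the Email value already exists in the index.
INSERT INTO dbo.Members (MemberID, Email) VALUES (2, N'a@example.com');
```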

Whenever the data in a table is modified, the indexes on that table or view are maintained automatically.

 

Indexes and constraints

Indexes are created automatically when PRIMARY KEY and UNIQUE constraints are defined on table columns. For example, when you create a table and identify a particular column as the primary key, the SQL Server 2005 Database Engine automatically creates a PRIMARY KEY constraint and an index on that column.
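
This behavior can be observed with a small experiment. The sketch below assumes a hypothetical table dbo.Products with made-up constraint names; after the table is created, the catalog view sys.indexes shows one index per constraint.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.Products
(
    ProductID int          NOT NULL CONSTRAINT PK_Products PRIMARY KEY,
    Sku       varchar(20)  NOT NULL CONSTRAINT UQ_Products_Sku UNIQUE,
    Name      nvarchar(50) NOT NULL
);

-- Both constraints show up as indexes. By default the PRIMARY KEY is backed by
-- a clustered index and the UNIQUE constraint by a nonclustered one.
SELECT i.name, i.type_desc, i.is_primary_key, i.is_unique_constraint
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.Products');
```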

 

The following index types are available in SQL Server 2005 (index type: description).

    • Clustered: A clustered index sorts and stores the data rows of the table or view in order, based on the clustered index key. It is implemented as a B-tree index structure that supports fast retrieval of rows based on their clustered index key values.
    • Nonclustered: A nonclustered index can be defined on a table or view that has a clustered index, or on a heap. Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered index or heap that has the key value. The rows in the index are stored in the order of the index key values, but the data rows are not guaranteed to be in any particular order unless a clustered index is created on the table.
    • Unique: A unique index ensures that the index key contains no duplicate values, so every row in the table or view is unique in some way. Both clustered and nonclustered indexes can be unique.
    • Index with included columns: A nonclustered index that is extended to include nonkey columns in addition to the key columns.
    • Indexed views: An index on a view materializes (executes) the view, and the result set is permanently stored in a unique clustered index, in the same way that a table with a clustered index is stored. Nonclustered indexes can be added to the view after the clustered index is created.
    • Full-text: A special type of token-based functional index that is built and maintained by the Microsoft Full-Text Engine for SQL Server (MSFTESQL) service. It provides efficient support for word searches in character string data.
    • XML: A shredded and persisted representation of the XML binary large objects (BLOBs) in an xml data type column.

 

Basic concepts of index design

Poorly designed indexes and a lack of indexes are primary sources of bottlenecks in database applications. Designing efficient indexes is one of the most important steps toward good database and application performance. Choosing the right indexes for a database and its workload is a balancing act between query speed and update cost. Narrow indexes, that is, indexes with few columns in the index key, need less disk space and less maintenance overhead. Wide indexes, on the other hand, cover more queries. You may have to try several different designs before you find the most efficient index. Indexes can be added, modified, and dropped without affecting the database schema or the application design, so do not hesitate to experiment with different indexes.

Using an index does not always mean good performance, and good performance does not always come from efficient index use. If using an index always produced the best performance, the job of the query optimizer would be simple. In reality, a poorly chosen index can lead to less than optimal performance.

 

The following tasks make up the recommended strategy for designing indexes:

    1. Understand the characteristics of the database itself. For example, is it an online transaction processing (OLTP) database with frequent data modifications, or a Decision Support System (DSS) or data warehousing (OLAP) database that contains mostly read-only data?
    2. Understand the characteristics of the most frequently used queries. For example, knowing that a frequently used query joins two or more tables helps you determine the best type of index to use.
    3. Understand the characteristics of the columns used in the queries. For example, an index is ideal for columns that have an integer data type and are also unique or non-null.
    4. Determine which index options might improve performance when the index is created or maintained. For example, creating a clustered index on an existing large table benefits from the ONLINE index option, which allows concurrent activity on the underlying data to continue while the index is being created or rebuilt.
    5. Determine the optimal storage location for the index. A nonclustered index can be stored in the same filegroup as the underlying table or in a different filegroup. The storage location can improve query performance by increasing disk I/O performance. For example, storing a nonclustered index in a filegroup that is on a different disk than the table filegroup can improve performance, because multiple disks can be read at the same time.

Alternatively, clustered and nonclustered indexes can use a partition scheme that spans multiple filegroups. Partitioning makes large tables or indexes easier to manage, because you can access or manage subsets of the data quickly and efficiently while the integrity of the overall collection is preserved. When you consider partitioning, decide whether the index should be partitioned in the same way as the table (aligned) or partitioned independently.
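
The ONLINE option and the choice of filegroup mentioned in the tasks above might look like this in T-SQL. This is only a sketch: dbo.Shipments and the index names are hypothetical, INDEX_FG is an assumed filegroup that must already exist in the database, and online index operations require an edition that supports them (in SQL Server 2005 that means Enterprise Edition, not the Express edition used in the test earlier).

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.Shipments
(
    ShipmentID int      NOT NULL,
    ShippedOn  datetime NOT NULL
);

-- Build the clustered index while the table stays available for reads and writes.
-- Assumption: an edition that supports online index operations.
CREATE CLUSTERED INDEX cix_Shipments_ShipmentID
    ON dbo.Shipments (ShipmentID)
    WITH (ONLINE = ON);

-- Store a nonclustered index on a different filegroup from the table.
-- INDEX_FG is a hypothetical filegroup name.
CREATE NONCLUSTERED INDEX ix_Shipments_ShippedOn
    ON dbo.Shipments (ShippedOn)
    ON [INDEX_FG];
```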

 

Database considerations
    • A large number of indexes on a table affects the performance of INSERT, UPDATE, and DELETE statements, because all indexes must be adjusted whenever the data in the table changes.

      • Avoid over-indexing heavily updated tables, and keep indexes narrow, that is, with as few columns as possible.
      • Use many indexes to improve query performance on tables that are rarely updated but hold large volumes of data. A large number of indexes can help queries that do not modify data, such as SELECT statements, because the query optimizer has more indexes to choose from when it determines the fastest access method.
    • Indexing a small table may not be optimal, because it can take the query optimizer longer to traverse the index in search of data than to perform a simple table scan. Indexes on small tables might therefore never be used, yet they must still be maintained as the data in the table changes.
    • Indexes on views can provide significant performance gains when the view contains aggregations, table joins, or a combination of the two. The view does not have to be referenced explicitly in the query for the query optimizer to use it.
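
An index on a view that contains an aggregation could be set up roughly as follows. The table dbo.Payments, the view dbo.vPaymentTotals, and the index name are hypothetical. GO is the batch separator used by SQL Server Management Studio and sqlcmd, needed here because CREATE VIEW must be the first statement in its batch.

```sql
-- Hypothetical base table. Amount is NOT NULL because an indexed view
-- cannot use SUM over a nullable expression.
CREATE TABLE dbo.Payments
(
    PaymentID  int   NOT NULL PRIMARY KEY,
    CustomerID int   NOT NULL,
    Amount     money NOT NULL
);
GO

-- The view must be schema-bound, use two-part object names, and include
-- COUNT_BIG(*) when it has a GROUP BY.
CREATE VIEW dbo.vPaymentTotals
WITH SCHEMABINDING
AS
SELECT CustomerID,
       COUNT_BIG(*) AS PaymentCount,
       SUM(Amount)  AS TotalAmount
FROM dbo.Payments
GROUP BY CustomerID;
GO

-- The first index on a view must be unique and clustered;
-- it materializes the result set of the view.
CREATE UNIQUE CLUSTERED INDEX cix_vPaymentTotals
    ON dbo.vPaymentTotals (CustomerID);
```

Indexed views also require certain session SET options (such as ANSI_NULLS and QUOTED_IDENTIFIER) to be ON, which is the default in Management Studio. In SQL Server 2005, automatic use of the indexed view by the optimizer is an Enterprise Edition feature; on other editions the query has to reference the view with the NOEXPAND table hint.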

 

Query considerations
    • Create nonclustered indexes on all columns that are frequently used in predicates and join conditions in queries.

      • Important: avoid adding unnecessary columns. Adding too many index columns can hurt disk space usage and index maintenance performance.
    • Covering indexes can improve query performance, because all the data needed to satisfy the query exists in the index itself. That is, only the index pages are needed to retrieve the requested data, not the data pages of the table or clustered index, which reduces overall disk I/O. For example, a query on columns a and b of a table that has a composite index on columns a, b, and c can retrieve the specified data from the index alone.
    • Write queries that insert or modify as many rows as possible in a single statement, instead of using multiple queries to update the same rows. With only one statement, the engine can take advantage of optimized index maintenance.
    • Evaluate the query type and how columns are used in the query. For example, a column used in an exact-match query is a good candidate for a nonclustered or clustered index.
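
The covering index example from the list above (a composite index on a, b, and c, queried on a and b) can be written out as a short sketch. The table dbo.T, the extra column d, and the index name are assumptions added for illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.T
(
    id int          NOT NULL PRIMARY KEY,
    a  int          NOT NULL,
    b  int          NOT NULL,
    c  int          NOT NULL,
    d  varchar(100) NULL
);

-- Composite nonclustered index on a, b, and c.
CREATE NONCLUSTERED INDEX ix_T_a_b_c ON dbo.T (a, b, c);

-- Covered: every referenced column (a, b) is in the index,
-- so the data pages of the table are not needed.
SELECT a, b FROM dbo.T WHERE a = 42;

-- Not covered: column d is outside the index, so the engine must also
-- visit the table rows (or scan the clustered index instead).
SELECT a, b, d FROM dbo.T WHERE a = 42;
```

Comparing the execution plans of the two queries should show the difference: the first can be answered from ix_T_a_b_c alone, while the second needs an additional lookup or a different access path.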
